Proxmox vzdump (weekly)
│ zstd compressed .vma.zst
▼
TrueNAS staging (/mnt/nas/proxmox-backup/dump)
keep-last=1 per VM
│ NFS mount → K8s Syncthing pod (Send Only)
▼
Synology NAS (/volume1/docker/proxmox-backup/dump)
Receive Only + Ignore Deletes → 14-day retention
Key design choices:
Four VMs are backed up weekly (Saturday nights), staggered to avoid concurrent large writes to TrueNAS NFS. Two separate backup jobs exist (split because of how vzdump prune-backups works per job):
| Job ID | VMs | Schedule | Compression | Retention |
|---|---|---|---|---|
| backup-all | 103, 104, 107 | Sat 02:00 | zstd | keep-last=1 |
| backup-node2 | 108 | Sat 02:30 | zstd | keep-last=1 |
keep-last=1 means Proxmox auto-prunes after each successful backup, leaving exactly one backup per VM on TrueNAS. The prune runs automatically at the end of the backup job.
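For reference, a roughly equivalent one-off run of the backup-all job from the Proxmox shell might look like this (a sketch only; snapshot mode is an assumption, and the storage ID truenas-backup matches the storage definition below):
# One-off equivalent of the backup-all job (sketch; verify flags against your PVE version)
vzdump 103,104,107 --storage truenas-backup --compress zstd --mode snapshot --prune-backups keep-last=1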
The TrueNAS staging area is mounted in Proxmox as an NFS storage:
| Setting | Value |
|---|---|
| Type | NFS |
| Server | 192.168.88.230 (TrueNAS Scale) |
| Export path | /mnt/nas/proxmox-backup/dump |
| Mount options | vers=4.2,soft,timeo=600,retrans=5 |
| Quota | 700 GiB (sufficient for 4 VMs × ~60 GB each with room for overlap during backup runs) |
The default Proxmox NFS mount used timeo=50 (5-second timeout). Proxmox VM backups write large zstd streams directly to NFS. TrueNAS ZFS commits data in bursts — during a flush of a 40–100 GB file, writes can stall for longer than 5 seconds, causing:
E: got a write error: Input/output error
Fix applied (2026-03-07): timeo=600,retrans=5 — 60-second timeout, 5 retries. Edit in PVE datacenter storage config or /etc/pve/storage.cfg.
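To apply the options from the CLI rather than the GUI, something like the following should work (a sketch; assumes the storage ID is truenas-backup and that pvesm set accepts the options property for NFS storages, so verify against your PVE version):
# Re-apply the NFS mount options on the storage definition (sketch)
pvesm set truenas-backup --options vers=4.2,soft,timeo=600,retrans=5
# Confirm the stanza in /etc/pve/storage.cfg now carries the options line
grep -A7 '^nfs: truenas-backup' /etc/pve/storage.cfg
# Note: the share may need to be remounted (or the node rebooted) before new options take effect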
Verify the mount options are active:
ssh [email protected] "mount | grep truenas"
# Should show: timeo=600,retrans=5
# Via Proxmox dashboard at http://192.168.88.100:9099/
# Backup Status card shows per-VM result and next run time
# Via pvesh CLI on Proxmox host
ssh [email protected]
pvesh get /cluster/backup
pvesh get /nodes/andy/tasks --typefilter vzdump --limit 10
# Check what's on TrueNAS staging
pvesh get /nodes/andy/storage/truenas-backup/content --content backup
K8s cluster
┌─────────────────────────────────────────────────┐
│ Namespace: syncthing │
│ │
│ Deployment: syncthing │
│ ┌─────────────────────────────────────────┐ │
│ │ syncthing/syncthing:latest │ │
│ │ runAsUser: 0 (reads NFS files as root) │ │
│ │ │ │
│ │ /var/syncthing/config ← Longhorn PVC │ │
│ │ /var/syncthing/data ← TrueNAS NFS │ │
│ │ (PV static) │ │
│ │ Folder mode: Send Only │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Service: syncthing (ClusterIP :8384) │
│ Service: syncthing-sync (NodePort :32200) │
└─────────────────────────────────────────────────┘
│ TCP 22000 via NodePort 32200
▼
Synology NAS (192.168.88.19)
┌─────────────────────────────────────────────────┐
│ Docker stack: syncthing │
│ │
│ container: syncthing │
│ ┌─────────────────────────────────────────┐ │
│ │ Port 8384 (GUI), 22000 (sync) │ │
│ │ /var/syncthing/config → bind mount │ │
│ │ /volume1/docker/syncthing/config │ │
│ │ /var/syncthing/data → bind mount │ │
│ │ /volume1/docker/proxmox-backup/dump │ │
│ │ Folder mode: Receive Only + IgnoreDelete│ │
│ └─────────────────────────────────────────┘ │
│ │
│ container: syncthing-pruner │
│ ┌─────────────────────────────────────────┐ │
│ │ alpine crond — 06:30 daily │ │
│ │ deletes *.vma.zst, *.log, *.notes │ │
│ │ older than 14 days from /data │ │
│ └─────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘
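The syncthing-pruner container in the stack above amounts to a daily find over the synced folder. A minimal sketch of the prune script (the script path and crontab wiring are assumptions; the /data path and file patterns come from the diagram):
# Runs from alpine crond, e.g. crontab entry: 30 6 * * * /prune.sh
# /data is the bind mount of /volume1/docker/proxmox-backup/dump
find /data -type f \( -name '*.vma.zst' -o -name '*.log' -o -name '*.notes' \) -mtime +14 -print -delete
Because the Synology folder is Receive Only, these local deletions stay local and are never propagated back to the Send Only side.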
| Resource | Name | Namespace | Notes |
|---|---|---|---|
| ArgoCD Application | syncthing | argocd | sync-wave 8 |
| PersistentVolume | syncthing-truenas-backup | — | Static NFS PV, 250Gi |
| PersistentVolumeClaim | syncthing-truenas-backup | syncthing | Binds to above PV |
| PersistentVolumeClaim | syncthing-config | syncthing | Longhorn 1Gi (persists device ID) |
| Deployment | syncthing | syncthing | Recreate strategy, runAsUser 0 |
| Service | syncthing | syncthing | ClusterIP, port 8384 (GUI) |
| Service | syncthing-sync | syncthing | NodePort 32200, port 22000 (sync) |
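A quick way to confirm these resources are present and bound (a sketch; assumes kubectl access to the cluster and the namespaces from the table above):
kubectl -n argocd get application syncthing
kubectl -n syncthing get deploy,svc,pvc
kubectl get pv syncthing-truenas-backup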
| Instance | Device ID |
|---|---|
| K8s pod | SKNA3BI-CNUWXRY-OX22RNI-RVAXFBH-VMO27VE-UTBT5X7-RK5KZA4-AIFIJQZ |
| Synology | YH2QVA4-XQ2SXXC-IEL2BXQ-GXF3OAM-HQJ42NK-KLPPXQN-DYRVUJB-SRCJVQL |
Device IDs are derived from TLS certificates stored in each Syncthing's config directory. They only change if the config directory is wiped.
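To read the current device IDs directly (instead of from logs or the GUI), query /rest/system/status on each instance; it reports the local ID as myID. The API keys are the same ones used in the re-pair procedure below:
# K8s pod
kubectl -n syncthing exec deploy/syncthing -- wget -qO- \
  --header="X-API-Key: uNjxtWZtj2biKXZsU4ekZwg7qqtvgZpW" \
  http://localhost:8384/rest/system/status | grep -o '"myID"[^,]*'
# Synology
curl -s http://192.168.88.19:8384/rest/system/status \
  -H "X-API-Key: XmLATZaLegYw9DXfR2p7guzSGfRbCjgN" | grep -o '"myID"[^,]*'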
| Component | Persistence mechanism | Survives |
|---|---|---|
| K8s pod device ID | Longhorn PVC syncthing-config | Pod restart, node failure, rescheduling |
| Synology device ID | /volume1/docker/syncthing/config (bind mount) | Container restart, NAS reboot |
| Synology container | restart: unless-stopped | NAS reboot (Docker auto-starts) |
| K8s deployment | ArgoCD self-heals | Cluster issues |
| Folder pairing config | Persisted in both config dirs | Independent restarts on either side |
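Two spot checks for the table above (the NAS SSH user is an assumption; adjust for your setup):
# K8s: restarting the pod must not change the device ID, since the config lives on the Longhorn PVC
kubectl -n syncthing rollout restart deploy/syncthing
kubectl -n syncthing rollout status deploy/syncthing
# Synology: confirm the restart policy that brings the container back after a NAS reboot
ssh [email protected] "docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' syncthing"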
# Check K8s pod health
kubectl -n syncthing get pods
kubectl -n syncthing logs deploy/syncthing --tail=20
# Check folder state and peer connection via NAS API (accessible from LAN)
curl -s http://192.168.88.19:8384/rest/system/connections \
-H "X-API-Key: XmLATZaLegYw9DXfR2p7guzSGfRbCjgN" | python3 -m json.tool
curl -s "http://192.168.88.19:8384/rest/db/status?folder=proxmox-backups" \
-H "X-API-Key: XmLATZaLegYw9DXfR2p7guzSGfRbCjgN" | python3 -m json.tool
# Or view Syncthing GUI at http://192.168.88.19:8384
The Proxmox monitoring dashboard at http://192.168.88.100:9099/ also shows the Replication to Synology (Syncthing) card with connection status, folder state, file counts, and pending sync bytes.
If either side's config is wiped (device ID changes), re-pair via API:
K8S_KEY="uNjxtWZtj2biKXZsU4ekZwg7qqtvgZpW"
NAS_KEY="XmLATZaLegYw9DXfR2p7guzSGfRbCjgN"
NEW_K8S_ID="<new device ID from kubectl -n syncthing logs deploy/syncthing>"
NEW_NAS_ID="<new device ID from docker logs syncthing on NAS>"
# On K8s: update NAS device entry
kubectl -n syncthing exec deploy/syncthing -- wget -qO- \
--header="X-API-Key: $K8S_KEY" \
--header="Content-Type: application/json" \
--post-data="{\"deviceID\":\"$NEW_NAS_ID\",\"name\":\"synology-nas\",\"addresses\":[\"tcp://192.168.88.19:22000\"]}" \
http://localhost:8384/rest/config/devices
# On NAS: update K8s device entry
curl -X POST http://192.168.88.19:8384/rest/config/devices \
-H "X-API-Key: $NAS_KEY" \
-H "Content-Type: application/json" \
-d "{\"deviceID\":\"$NEW_K8S_ID\",\"name\":\"k8s-syncthing\",\"addresses\":[\"tcp://192.168.88.11:32200\",\"tcp://192.168.88.12:32200\",\"tcp://192.168.88.13:32200\",\"tcp://192.168.88.14:32200\"]}"
| Location | Tool | Retention | Trigger |
|---|---|---|---|
| TrueNAS staging | Proxmox prune-backups | keep-last=1 | Auto at end of each backup job |
| Synology archive | syncthing-pruner crond | 14 days (mtime) | Daily 06:30 |
Worst-case disk usage: TrueNAS staging can briefly hold two copies per VM (the previous backup plus the in-progress one, since pruning only runs after a successful backup), which is what the 700 GiB quota leaves headroom for. The Synology archive can briefly hold up to three weekly sets per VM, because a set only becomes eligible for the daily 06:30 prune once it is more than 14 days old.
Symptom:
E: got a write error: Input/output error
Cause: NFS timeo too low for large writes. Verify mount options:
ssh [email protected] "mount | grep truenas | grep timeo"
# Must show timeo=600,retrans=5
If wrong, check /etc/pve/storage.cfg — PVE may have reset mount options.
If backups are not showing up on the Synology side, check the pod, the peer connection, and the folder state:
kubectl -n syncthing get pods
curl http://192.168.88.19:8384/rest/system/connections -H "X-API-Key: XmLATZaLegYw9DXfR2p7guzSGfRbCjgN"
curl "http://192.168.88.19:8384/rest/db/status?folder=proxmox-backups" -H "X-API-Key: XmLATZaLegYw9DXfR2p7guzSGfRbCjgN"
If the folder state is still scanning, wait — 257 GB of large files takes time to hash on first run.
Check the last backup task log:
ssh [email protected] "pvesh get /nodes/andy/tasks --typefilter vzdump --limit 20"
Then get the specific task UPID and check its log:
pvesh get /nodes/andy/tasks/<UPID>/log