- SHARED_SECRET now read from env and fail-closed: unset/placeholder refuses ALL plugin connections (constant-time compare). The old hardcoded 'your_shared_secret' in this public repo was no auth at all. Dockerfile default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of falling back to a publicly-known signing key; agent systemd unit now requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every nginx-proxied internet request satisfied via docker-proxy, giving full session bypass and unauthenticated in-game command injection) with private-source AND no X-Forwarded-For, i.e. only genuinely internal callers (overlord-agent on the host, compose-network services). Invariant documented in nginx/overlord.conf: every tracker-bound location must set X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were internet-reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs (telemetry/spawn hypertable data excluded), 10MB canary, umask 077, TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.
Adversarially reviewed pre-deploy (3-lens workflow): ship verdict; deploy-sequencing blockers addressed (secret staged before enforcement, exec bit set, cron uses bash).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Database backups
Nightly logical backups of both databases, taken by
scripts/backup-databases.sh via a cron
job on the live host (user erik, who is in the docker group — no sudo
needed). Install with:
mkdir -p /home/erik/backups # MUST exist before the first run —
# cron opens the log redirect before
# the script's own mkdir executes
crontab -e # add the line below
15 3 * * * bash /home/erik/MosswartOverlord/scripts/backup-databases.sh >> /home/erik/backups/backup.log 2>&1
Dumps land in /home/erik/backups/postgres/ as dereth-YYYYMMDD-HHMM.dump
and inventory-YYYYMMDD-HHMM.dump (pg_dump custom format, compressed,
mode 0600). Retention: ~8 days of dailies (-mtime +7), pruned by the
script itself only after a successful run. The nightly backup.log will
contain pg_dump circular-FK warnings about hypertable chunks — those are
normal; the canary to watch is the printed dump sizes (a healthy dereth
dump is ~50 MB, and the script aborts if it drops below 10 MB).
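The size canary and the prune-after-success rule described above can be sketched as two small shell helpers. This is a minimal illustration, not the contents of backup-databases.sh: the function names check_min_size and prune_old_dumps are hypothetical, and the 10 MB and 7-day values are simply the ones quoted in this document.

```shell
# Hypothetical helpers; names are illustrative, not taken from backup-databases.sh.

# Succeeds only if FILE is at least MIN_BYTES long; prints the size either way.
check_min_size() {
    local file=$1 min_bytes=$2 size
    size=$(( $(wc -c < "$file") ))   # arithmetic expansion strips any padding from wc
    if [ "$size" -lt "$min_bytes" ]; then
        echo "ABORT: $file is $size bytes, expected at least $min_bytes" >&2
        return 1
    fi
    echo "$file: $size bytes"
}

# Deletes *.dump files under DIR whose modification time is more than DAYS days old.
prune_old_dumps() {
    local dir=$1 days=$2
    find "$dir" -type f -name '*.dump' -mtime +"$days" -delete
}

# Usage sketch: prune only when the fresh dump passed the size check.
# check_min_size "$dump" $((10 * 1024 * 1024)) \
#     && prune_old_dumps /home/erik/backups/postgres 7
```

Chaining the prune behind the check with && means a failed or truncated dump never causes older, good dumps to be deleted.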
What is and isn't included
- dereth (TimescaleDB): everything EXCEPT the row data of the
  telemetry_events and spawn_events hypertables (their chunk data in
  _timescaledb_internal._hyper_* is excluded). That data is ~12 GB and
  expires through retention policies within 7–30 days anyway. The
  irreplaceable tables (users, char_stats, rare_stats, rare_stats_sessions,
  rare_events, combat_stats, combat_stats_sessions, portals,
  character_stats, server_status) are fully included. Table schemas for the
  excluded hypertables are still dumped, so a restore recreates them empty.
- inventory_db: full dump (items, combat stats, enhancements, spells,
  requirements, ratings, raw JSON).
⚠ The _timescaledb_internal._hyper_* exclusion drops the chunk data of
every hypertable, present and future. If an irreplaceable table is ever
converted to a hypertable (or a continuous aggregate is added), revisit the
exclusion list — otherwise its data silently disappears from backups.
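One way to notice such a change early is to compare the hypertables that actually exist against the ones that are intentionally excluded. The sketch below is an assumption-laden illustration: the function name unexpected_hypertables and the EXPECTED_EXCLUDED variable are hypothetical, and the usage line assumes the live database is named dereth, which this document does not state.

```shell
# Hypothetical check; EXPECTED_EXCLUDED and the function name are illustrative.
EXPECTED_EXCLUDED="telemetry_events spawn_events"

# Reads hypertable names (one per line) on stdin and prints every name that is
# NOT on the expected list, i.e. a table whose row data the dump would skip.
unexpected_hypertables() {
    local name
    while IFS= read -r name; do
        if [ -z "$name" ]; then
            continue
        fi
        case " $EXPECTED_EXCLUDED " in
            *" $name "*) ;;        # known and intentionally excluded
            *) echo "$name" ;;     # new hypertable: revisit the exclusion list
        esac
    done
}

# Usage sketch (database name "dereth" is an assumption):
# docker exec dereth-db psql -U postgres -d dereth -Atc \
#   "SELECT hypertable_name FROM timescaledb_information.hypertables" \
#   | unexpected_hypertables
```

Empty output means the exclusion still covers only the two expected hypertables; any printed name is a table to review.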
Off-host copies (recommended, not yet automated)
The dumps live on the same disk as the databases. Sync them off-host periodically, e.g. from another machine:
rsync -av erik@overlord.snakedesert.se:backups/postgres/ ./overlord-backups/
Restore
inventory_db (plain Postgres)
docker exec -i inventory-db pg_restore -U inventory_user -d inventory_db --clean --if-exists < inventory-<stamp>.dump
dereth (TimescaleDB — needs pre/post restore calls)
TimescaleDB requires putting the extension into restore mode around the
pg_restore; otherwise restoring its catalog rows fails:
# 1. Create a fresh DB (or use --clean against the existing one)
docker exec dereth-db psql -U postgres -c "CREATE DATABASE dereth_restore;"
docker exec dereth-db psql -U postgres -d dereth_restore -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
# 2. Pre-restore mode
docker exec dereth-db psql -U postgres -d dereth_restore -c "SELECT timescaledb_pre_restore();"
# 3. Restore the dump
docker exec -i dereth-db pg_restore -U postgres -d dereth_restore --no-owner < dereth-<stamp>.dump
# 4. Post-restore mode (re-enables background workers, validates catalog)
docker exec dereth-db psql -U postgres -d dereth_restore -c "SELECT timescaledb_post_restore();"
Notes:
- Step 3 reports one ignorable error: the dump's CREATE EXTENSION timescaledb
  collides with the extension pre-created in step 1 ("already exists",
  errors ignored on restore: 1). That is expected, not a failed restore.
- The TimescaleDB version at restore time must be the same as at dump time
  (restore first, then ALTER EXTENSION timescaledb UPDATE if upgrading).
  Same-container restores with the image pinned in docker-compose.yml
  (timescale/timescaledb:2.19.3-pg14) are fine.
Then either point DATABASE_URL at the restored DB or rename databases.
The telemetry_events/spawn_events hypertables come back empty (by
design); retention/compression policies are part of the dump and reattach.
Verifying a backup
pg_restore --list dereth-<stamp>.dump | head # table of contents
pg_restore --list dereth-<stamp>.dump | grep -c 'TABLE DATA'
A dump that suddenly shrinks dramatically (check backup.log sizes) is the
canary for silent failure.
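The two checks above can be wrapped into helpers for scripted use. This is a rough sketch under stated assumptions: the names shrank_dramatically and count_table_data are hypothetical, and the "less than half of the previous size" threshold is an illustrative choice, not a value from the backup script.

```shell
# Hypothetical helpers; the 50% threshold is an illustrative assumption.

# Succeeds (exit 0) when NEW_BYTES is less than half of PREV_BYTES.
shrank_dramatically() {
    local prev_bytes=$1 new_bytes=$2
    [ $(( new_bytes * 2 )) -lt "$prev_bytes" ]
}

# Counts TABLE DATA entries in a `pg_restore --list` listing read from stdin.
# grep -c exits non-zero on zero matches, so `|| true` keeps the exit status clean.
count_table_data() {
    grep -c 'TABLE DATA' || true
}

# Usage sketch:
# pg_restore --list dereth-<stamp>.dump | count_table_data
# shrank_dramatically "$(wc -c < yesterday.dump)" "$(wc -c < today.dump)" \
#     && echo "WARNING: dump shrank by more than half"
```

A falling TABLE DATA count between two dumps points at tables that dropped out of the backup, while the size comparison catches a dump that is structurally complete but nearly empty.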