- SHARED_SECRET now read from env and fail-closed: unset/placeholder refuses
  ALL plugin connections (constant-time compare). The old hardcoded
  'your_shared_secret' in this public repo was no auth at all. Dockerfile
  default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of
  falling back to a publicly-known signing key; agent systemd unit now
  requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every
  nginx-proxied internet request satisfied via docker-proxy, giving full
  session bypass and unauthenticated in-game command injection) with
  private-source AND no X-Forwarded-For, i.e. only genuinely internal callers
  (overlord-agent on the host, compose-network services). Invariant
  documented in nginx/overlord.conf: every tracker-bound location must set
  X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were
  internet-reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs
  (telemetry/spawn hypertable data excluded), 10MB canary, umask 077,
  TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.

Adversarially reviewed pre-deploy (3-lens workflow): ship verdict;
deploy-sequencing blockers addressed (secret staged before enforcement, exec
bit set, cron uses bash).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

102 lines
4.4 KiB
Markdown
# Database backups

Nightly logical backups of both databases, taken by
[`scripts/backup-databases.sh`](../scripts/backup-databases.sh) via a cron
job on the live host (user `erik`, who is in the `docker` group, so no sudo
is needed). Install with:

```
mkdir -p /home/erik/backups   # MUST exist before the first run:
                              # cron opens the log redirect before
                              # the script's own mkdir executes
crontab -e                    # add the line below
15 3 * * * bash /home/erik/MosswartOverlord/scripts/backup-databases.sh >> /home/erik/backups/backup.log 2>&1
```

Dumps land in `/home/erik/backups/postgres/` as `dereth-YYYYMMDD-HHMM.dump`
and `inventory-YYYYMMDD-HHMM.dump` (pg_dump custom format, compressed,
mode 0600). Retention: ~8 days of dailies (`-mtime +7`), pruned by the
script itself only after a successful run. The nightly `backup.log` will
contain pg_dump circular-FK warnings about hypertable chunks; those are
normal. The canary to watch is the printed dump sizes (a healthy dereth
dump is ~50 MB, and the script aborts if it drops below 10 MB).

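The size canary and age-based pruning described above can be sketched as two small shell functions. This is an illustration only, not the contents of `scripts/backup-databases.sh`: the function names `check_dump_size` and `prune_old_dumps` are hypothetical, and the sketch assumes a `find` that supports `-maxdepth` and `-delete` (GNU or BSD). `find -mtime +7` matches files whose age, truncated to whole days, is greater than 7, which is why roughly 8 days of dumps remain.

```bash
#!/usr/bin/env bash
# Sketch only: the real logic lives in scripts/backup-databases.sh and may differ.

# Return non-zero if a dump file is smaller than a minimum size in bytes.
check_dump_size() {
    local file=$1 min_bytes=$2 size
    size=$(wc -c < "$file") || return 1
    if [ "$size" -lt "$min_bytes" ]; then
        echo "ABORT: $file is only $size bytes (< $min_bytes)" >&2
        return 1
    fi
    echo "OK: $file is $size bytes"
}

# Delete *.dump files in one directory that match `-mtime +7`.
prune_old_dumps() {
    local dir=$1
    find "$dir" -maxdepth 1 -type f -name '*.dump' -mtime +7 -delete
}
```

A caller would prune only once the new dump has passed its size check, for example `check_dump_size "$dump" $((10 * 1024 * 1024)) && prune_old_dumps /home/erik/backups/postgres`, where `$dump` is a placeholder for the path of the dump just written. Whether the script counts its 10 MB threshold in MiB or decimal megabytes is not stated here.
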
## What is and isn't included

- **dereth** (TimescaleDB): everything EXCEPT the row data of the
  `telemetry_events` and `spawn_events` hypertables (their chunk data in
  `_timescaledb_internal._hyper_*` is excluded). That data is ~12 GB and
  expires through retention policies within 7–30 days anyway. The
  irreplaceable tables (`users`, `char_stats`, `rare_stats`,
  `rare_stats_sessions`, `rare_events`, `combat_stats`,
  `combat_stats_sessions`, `portals`, `character_stats`, `server_status`)
  are fully included. Table *schemas* for the excluded hypertables are
  still dumped, so a restore recreates them empty.
- **inventory_db**: full dump (items, combat stats, enhancements, spells,
  requirements, ratings, raw JSON).

⚠ The `_timescaledb_internal._hyper_*` exclusion drops the chunk data of
**every** hypertable, present and future. If an irreplaceable table is ever
converted to a hypertable (or a continuous aggregate is added), revisit the
exclusion list; otherwise its data silently disappears from backups.

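One way to notice that a new hypertable has appeared is to compare the live list of hypertables against the two whose data is intentionally excluded. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the live database is assumed to be named `dereth`, and the query relies on the `timescaledb_information.hypertables` view from TimescaleDB 2.x.

```bash
# Hypertables whose chunk data is intentionally left out of the dump
# (names taken from this document).
EXPECTED_HYPERTABLES="telemetry_events spawn_events"

# Reads hypertable names on stdin, one per line. Prints every name that is
# not in EXPECTED_HYPERTABLES and returns non-zero if any were found.
unexpected_hypertables() {
    local name found=0
    while IFS= read -r name; do
        [ -z "$name" ] && continue
        case " $EXPECTED_HYPERTABLES " in
            *" $name "*) ;;
            *) echo "$name"; found=1 ;;
        esac
    done
    [ "$found" -eq 0 ]
}

# Lists hypertables in the live database.
# Assumption: the database is named "dereth" and is reachable as user
# postgres inside the dereth-db container.
list_hypertables() {
    docker exec dereth-db psql -U postgres -d dereth -At \
        -c "SELECT hypertable_name FROM timescaledb_information.hypertables;"
}
```

Running `list_hypertables | unexpected_hypertables` prints nothing and exits 0 while only the two expected hypertables exist; any other name it prints is a table whose chunk data the current exclusion would drop.
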
## Off-host copies (recommended, not yet automated)

The dumps live on the same disk as the databases. Sync them off-host
periodically, e.g. from another machine:

```
rsync -av erik@overlord.snakedesert.se:backups/postgres/ ./overlord-backups/
```

## Restore

### inventory_db (plain Postgres)

```bash
# Replace <stamp> with the dump's YYYYMMDD-HHMM timestamp.
docker exec -i inventory-db pg_restore -U inventory_user -d inventory_db --clean --if-exists < inventory-<stamp>.dump
```

### dereth (TimescaleDB — needs pre/post restore calls)

TimescaleDB requires putting the extension into restore mode around the
`pg_restore`; otherwise the TimescaleDB catalog rows fail to restore:

```bash
# 1. Create a fresh DB (or use --clean against the existing one)
docker exec dereth-db psql -U postgres -c "CREATE DATABASE dereth_restore;"
docker exec dereth-db psql -U postgres -d dereth_restore -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"

# 2. Pre-restore mode
docker exec dereth-db psql -U postgres -d dereth_restore -c "SELECT timescaledb_pre_restore();"

# 3. Restore the dump
docker exec -i dereth-db pg_restore -U postgres -d dereth_restore --no-owner < dereth-<stamp>.dump

# 4. Post-restore mode (re-enables background workers, validates catalog)
docker exec dereth-db psql -U postgres -d dereth_restore -c "SELECT timescaledb_post_restore();"
```

Notes:
- Step 3 reports one ignorable error: the dump's `CREATE EXTENSION
  timescaledb` collides with the extension pre-created in step 1
  ("already exists", `errors ignored on restore: 1`). That is expected,
  not a failed restore.
- The TimescaleDB **version** at restore time must be the **same** as at
  dump time (restore first, then `ALTER EXTENSION timescaledb UPDATE` if
  upgrading). Same-container restores with the image pinned in
  docker-compose.yml (`timescale/timescaledb:2.19.3-pg14`) are fine.

Then either point `DATABASE_URL` at the restored DB or rename databases.
The `telemetry_events`/`spawn_events` hypertables come back empty (by
design); retention/compression policies are part of the dump and reattach.

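The same-version requirement from the notes above can be checked when the source database still exists, by comparing the extension version in both databases. This is a rough sketch, not part of the documented procedure: `timescaledb_version` and `versions_match` are hypothetical helper names, and the live database name `dereth` is an assumption.

```bash
# Prints the installed TimescaleDB extension version for one database.
# Assumption: container dereth-db and user postgres, as in the restore steps.
timescaledb_version() {
    local db=$1
    docker exec dereth-db psql -U postgres -d "$db" -At \
        -c "SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';"
}

# Succeeds only if both version strings are non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

For example, `versions_match "$(timescaledb_version dereth)" "$(timescaledb_version dereth_restore)"` can be run after step 1 and before step 3. It cannot help once the source database is gone, since it reads the version from a running database rather than from the dump file.
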
## Verifying a backup

```bash
pg_restore --list dereth-<stamp>.dump | head   # table of contents
pg_restore --list dereth-<stamp>.dump | grep -c 'TABLE DATA'
```

A dump that suddenly shrinks dramatically (check `backup.log` sizes) is the
canary for silent failure.
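
Beyond counting `TABLE DATA` entries, the listing can be checked for each table this document calls irreplaceable. The sketch below is a suggestion rather than an existing script: `missing_table_data` is a hypothetical name, and it assumes those tables live in the `public` schema.

```bash
# Tables this document lists as irreplaceable.
REQUIRED_TABLES="users char_stats rare_stats rare_stats_sessions rare_events
combat_stats combat_stats_sessions portals character_stats server_status"

# Reads `pg_restore --list` output on stdin and prints every required table
# that has no TABLE DATA entry. Returns non-zero if any are missing.
# Assumption: the tables live in the "public" schema.
missing_table_data() {
    local listing table missing=0
    listing=$(cat)
    for table in $REQUIRED_TABLES; do
        if ! printf '%s\n' "$listing" | grep -q "TABLE DATA public $table "; then
            echo "$table"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Usage: `pg_restore --list dereth-<stamp>.dump | missing_table_data`. Empty output with exit status 0 means every listed table has a data entry in the dump; it does not prove the rows themselves are complete.
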