Accepts one legacy secret alongside the real one so existing clients keep
registering while game machines migrate to websocket_secret.txt. Remove
SHARED_SECRET_LEGACY from .env after the rollout.
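
A minimal sketch of the dual-secret check described above, for illustration
only. The function name `secret_is_valid` and its parameters are hypothetical;
only SHARED_SECRET_LEGACY is named in this change, and the real handler may be
structured differently.

```python
import hmac
from typing import Optional


def secret_is_valid(presented: str, current: str, legacy: Optional[str] = None) -> bool:
    """Return True if `presented` matches the current or the legacy secret."""
    # Skip unset/empty secrets so an empty legacy value never matches anything.
    candidates = [s for s in (current, legacy) if s]
    ok = False
    for candidate in candidates:
        # compare_digest does not short-circuit on the first differing byte;
        # every candidate is checked rather than returning early.
        ok |= hmac.compare_digest(presented.encode(), candidate.encode())
    return ok
```

Once the rollout is done, dropping the legacy value means calling this with
`legacy=None`, which leaves only the real secret accepted.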
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- SHARED_SECRET is now read from env and fails closed: an unset or placeholder
value refuses ALL plugin connections, and the comparison is constant-time.
The old hardcoded 'your_shared_secret' in this public repo was no auth at
all. Dockerfile default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of
falling back to a publicly-known signing key; agent systemd unit now
requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every
nginx-proxied internet request satisfied via docker-proxy — full session
bypass and unauthenticated in-game command injection) with
private-source AND no X-Forwarded-For, i.e. only genuinely internal
callers (overlord-agent on the host, compose-network services). Invariant
documented in nginx/overlord.conf: every tracker-bound location must set
X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were internet-
reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs
(telemetry/spawn hypertable data excluded), 10MB canary, umask 077,
TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.
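
The fail-closed startup behaviour for SECRET_KEY can be sketched roughly as
follows. This is an illustration, not the code in main.py or agent/auth.py:
the helper name `require_secret` and the PLACEHOLDERS set are assumptions
(only 'your_shared_secret' is named above).

```python
import os
from typing import Mapping

# Assumed placeholder set; only 'your_shared_secret' appears in this commit.
PLACEHOLDERS = {"", "your_shared_secret"}


def require_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the secret stored under `name`, or raise instead of falling back."""
    value = env.get(name, "").strip()
    if value in PLACEHOLDERS:
        # No default and no publicly-known fallback: refuse to continue.
        raise RuntimeError(f"{name} is unset or a placeholder; refusing to start")
    return value
```

The same check could back SHARED_SECRET, with the caller rejecting plugin
connections instead of aborting startup.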
Adversarially reviewed pre-deploy (3-lens workflow): ship verdict; deploy-
sequencing blockers addressed (secret staged before enforcement, exec bit
set, cron uses bash).
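
The AuthMiddleware rule above (private source AND no X-Forwarded-For) can be
sketched as below. The function name `is_internal_caller` and its signature
are hypothetical, and the real middleware may read the address and headers
differently.

```python
import ipaddress
from typing import Mapping


def is_internal_caller(client_ip: str, headers: Mapping[str, str]) -> bool:
    """True only for a private-source request that did not pass through nginx."""
    # Header names are case-insensitive; normalise before looking up.
    lowered = {k.lower() for k in headers}
    if "x-forwarded-for" in lowered:
        # nginx sets this on every tracker-bound location, so its presence
        # marks a proxied (external) request even if the source IP is 172.x.
        return False
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable source address: fail closed
    return addr.is_private or addr.is_loopback
```

The check is only sound while the nginx invariant holds: a location that
forgets to set X-Forwarded-For would look internal again.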
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
timescaledb-tune had configured shared_buffers=96396MB — three times the
physical RAM of the host. The kernel was giving PG everything it could
(~30GB of shared memory), leaving <100MB free for everything else.
This caused the OS page cache to be constantly evicted, every query to
hit disk, and telemetry writes to balloon to 20+ seconds.
New settings (standard 25/50 rule for 32GB):
- shared_buffers: 96GB → 8GB
- effective_cache_size: 16GB (query planner hint)
- work_mem: 16MB per operation
- maintenance_work_mem: 1GB (for vacuum/index)
- max_wal_size: 4GB
Requires a db container restart to take effect.
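
As a quick check of the arithmetic behind the 25/50 rule, a small sketch (the
helper name `pg_memory_settings` is hypothetical; the other settings above are
fixed choices, not derived from RAM here):

```python
def pg_memory_settings(total_ram_gb: int) -> dict:
    """25% of RAM for shared_buffers, 50% for effective_cache_size."""
    return {
        "shared_buffers": f"{total_ram_gb // 4}GB",
        "effective_cache_size": f"{total_ram_gb // 2}GB",
    }


# For the 32GB host this gives shared_buffers=8GB, effective_cache_size=16GB.
settings = pg_memory_settings(32)
```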
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Database cleanup:
- Converted spawn_events to a TimescaleDB hypertable with 7-day retention.
It was previously a regular table growing unbounded: 482M rows/66GB
accumulated since June 2025. A manual migration copied the last 7 days
(12M rows) to a new hypertable, swapped names, and dropped the old table.
Result: DB shrank from 77GB to 12GB.
- Dropped server_health_checks table entirely. It was write-only (850K rows,
134MB) — only current state in server_status is actually read. Eliminated
the insert from monitor_server_health().
Telemetry handler cleanup:
- Removed 4 per-message INFO log lines (TELEMETRY_RECEIVED, DB_WRITE_ATTEMPT,
DB_WRITE_SUCCESS, PROCESSING_COMPLETE). With 60+ characters each sending
telemetry every 2s, that added up to over a hundred log lines/sec. Replaced
with single SLOW_* warnings above the 500ms/1000ms thresholds.
- Removed redundant pool-size introspection (try/except + hasattr) on every
telemetry message — useless noise in the hot path.
- Removed debug cache-miss and kill-delta logs.
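
The threshold-based logging that replaces the per-message lines could look
roughly like this. It is a sketch under assumptions: the helper name
`run_with_slow_warning` and the label "SLOW_DB_WRITE" are hypothetical, and
which stage uses 500ms versus 1000ms is not stated above.

```python
import logging
import time
from typing import Any, Callable, Tuple

logger = logging.getLogger("telemetry")


def run_with_slow_warning(
    fn: Callable[[], Any], threshold_ms: float, label: str
) -> Tuple[Any, bool]:
    """Run fn(); log one warning only if it took longer than threshold_ms."""
    start = time.perf_counter()
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000
    warned = elapsed_ms > threshold_ms
    if warned:
        # Quiet on the normal path; a single line when the stage is slow.
        logger.warning("%s took %.0fms", label, elapsed_ms)
    return result, warned


# Usage: wrap the write stage with an assumed 500ms threshold.
value, was_slow = run_with_slow_warning(lambda: 42, 500, "SLOW_DB_WRITE")
```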
Log level:
- docker-compose.yml: dereth-tracker LOG_LEVEL DEBUG → INFO (was dumping
entire inventory_delta JSON payloads for every item update).
- inventory-service LOG_LEVEL DEBUG → INFO.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Browser WS clients can now send {"type": "subscribe", "message_types": [...]}
to only receive specific message types. Default is all (no change for browsers).
- Discord bot subscribes to only "rare" and "chat" — eliminates 82GB+ of
unnecessary telemetry/vitals/inventory traffic.
- Idle detection now has a 5-minute grace period before firing Discord alerts,
preventing false positives on brief idle states.
- Added DISCORD_ACLOG_WEBHOOK env var to docker-compose.yml for death/idle alerts.
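
A sketch of how the per-client subscription filter might work on the server
side. The class name `BrowserClient` and its methods are hypothetical; only
the {"type": "subscribe", "message_types": [...]} message shape comes from
this change.

```python
import json
from typing import Optional, Set


class BrowserClient:
    """Tracks which message types one browser WebSocket wants to receive."""

    def __init__(self) -> None:
        # None means "no subscription sent yet": deliver every message type.
        self.subscribed: Optional[Set[str]] = None

    def handle_message(self, raw: str) -> None:
        msg = json.loads(raw)
        if msg.get("type") == "subscribe":
            self.subscribed = set(msg.get("message_types", []))

    def wants(self, message_type: str) -> bool:
        return self.subscribed is None or message_type in self.subscribed


# Usage: the Discord bot narrows itself to "rare" and "chat".
bot = BrowserClient()
bot.handle_message('{"type": "subscribe", "message_types": ["rare", "chat"]}')
```

A broadcast loop would then call `wants(...)` before sending, so filtered
messages are never serialized onto that socket.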
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace Nginx basic auth with proper user accounts:
- Session cookies via itsdangerous (30-day expiry, httponly, secure)
- Password hashing with bcrypt via passlib
- Login page with AC-themed UI
- Admin page for user management (CRUD)
- AuthMiddleware exempts plugin WS and browser WS endpoints
- Issues/comments author auto-populated from session
- Sidebar shows logged-in username, admin link, and logout
- Seed users: erik (admin), alex, lundberg
- SECRET_KEY env var for cookie signing