MosswartOverlord/nginx/overlord.conf
Erik a28b61511c security: enforce real plugin secret, fix proxy auth bypass, loopback DB ports, nightly backups
- SHARED_SECRET now read from env and fail-closed: unset/placeholder refuses
  ALL plugin connections (constant-time compare). The old hardcoded
  'your_shared_secret' in this public repo was no auth at all. Dockerfile
  default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of
  falling back to a publicly-known signing key; agent systemd unit now
  requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every
  nginx-proxied internet request satisfied via docker-proxy — full session
  bypass and unauthenticated in-game command injection) with
  private-source AND no X-Forwarded-For, i.e. only genuinely internal
  callers (overlord-agent on the host, compose-network services). Invariant
  documented in nginx/overlord.conf: every tracker-bound location must set
  X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were internet-
  reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs
  (telemetry/spawn hypertable data excluded), 10MB canary, umask 077,
  TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.

Adversarially reviewed pre-deploy (3-lens workflow): ship verdict; deploy-
sequencing blockers addressed (secret staged before enforcement, exec bit
set, cron uses bash).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 17:02:47 +02:00

# Nginx site config for overlord.snakedesert.se
#
# Lives on the host (not in the Docker stack) at:
#   /etc/nginx/sites-enabled/overlord
#
# This file is the source-of-truth copy committed to git. To deploy a change:
#   1. Edit this file in the repo
#   2. SSH to the host
#   3. sudo cp /home/erik/MosswartOverlord/nginx/overlord.conf /etc/nginx/sites-enabled/overlord
#   4. sudo nginx -t && sudo nginx -s reload
#
# Critical settings:
#   - proxy_read_timeout / proxy_send_timeout 1d on /websocket/ and /
#     WebSockets are long-lived; nginx's default 60s proxy_read_timeout
#     closes any connection on which the backend sends nothing for 60s.
#     Removing these timeouts caused all plugin connections to drop every
#     ~60s when no data flowed from backend to client (April 2026 incident).
#   - SECURITY INVARIANT: every location that proxies to the `tracker`
#     upstream MUST set proxy_set_header X-Forwarded-For. The backend treats
#     a private-source request WITHOUT that header as internal (host/compose
#     callers) and skips session auth, so a tracker-bound location that
#     forgot the header would silently bypass login for the whole internet.
#     This includes any future port-80 or alternate server block.
#   - /grafana/ panel embeds rely on Grafana's anonymous Viewer auth
#     (GF_AUTH_ANONYMOUS_ENABLED=true in docker-compose.yml), so there are
#     no credentials in this file. Do NOT hardcode tokens here: this file is
#     committed to a public repo, and Grafana's state DB is ephemeral
#     container storage, so service-account tokens get orphaned on every
#     container recreate. A previously committed (long-dead) token was
#     removed in June 2026.
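#
# Sketch for the "future port-80 block" case named in the invariant above
# (an assumption: no such block exists in this file today). A plain-HTTP
# block that only redirects never proxies to `tracker`, so it cannot break
# the invariant:
#
#   server {
#       listen 80;
#       server_name overlord.snakedesert.se;
#       return 301 https://$host$request_uri;
#   }
#
# If such a block ever did proxy to `tracker`, it would need the same
# proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; line that
# every tracker-bound location in this file carries.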
server {
    listen 443 ssl;
    server_name overlord.snakedesert.se;
    # Security hardening
    server_tokens off;
    add_header X-Frame-Options SAMEORIGIN always;
    add_header X-Content-Type-Options nosniff always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header Referrer-Policy strict-origin-when-cross-origin always;
    add_header Permissions-Policy "geolocation=(), camera=(), microphone=()" always;
    # SSL certificates
    ssl_certificate /etc/letsencrypt/live/overlord.snakedesert.se/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/overlord.snakedesert.se/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
    # Plugin WebSocket ingest (`/ws/position` upstream)
    location /websocket/ {
        proxy_pass http://tracker/ws/position;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header X-Plugin-Secret $http_x_plugin_secret;
        proxy_cache_bypass $http_upgrade;
        # Long-lived WebSocket: don't time out the proxy
        proxy_read_timeout 1d;
        proxy_send_timeout 1d;
    }
    # Overlord Agent: host-side service running OUTSIDE the Docker stack
    # because it shells out to `claude`, which depends on host-side
    # ~/.claude credentials. Long timeout because agent calls can spin
    # while Claude Code chains tool invocations.
    location /api/agent/ {
        proxy_pass http://127.0.0.1:8767/agent/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass_request_headers on;
        # Heavy tool calls (cross-char search, suitbuilder) can take a while;
        # the Python wrapper caps each turn at 240s, so 300s gives some
        # headroom for the round trip.
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
    # API endpoints (live, trails, history, stats): short-lived HTTP
    location /api/ {
        proxy_pass http://tracker/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
    }
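    # Sketch, not applied here (assumption: `$connection_upgrade` is a
    # hypothetical variable name, following the pattern in the nginx
    # WebSocket proxying docs). The /api/ block above sends
    # Connection "upgrade" on every request, including plain HTTP API calls.
    # To send it only when the client asks for an upgrade, a map in the
    # http context (for example above `server {` in this file) could drive
    # the header:
    #
    #   map $http_upgrade $connection_upgrade {
    #       default upgrade;
    #       ''      close;
    #   }
    #
    # and the location would then use:
    #
    #   proxy_set_header Connection $connection_upgrade;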
    # Frontend UI and browser WebSocket (`/ws/live` upstream)
    location / {
        proxy_pass http://tracker/;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
        # Long-lived browser WebSocket (/ws/live): don't time out
        proxy_read_timeout 1d;
        proxy_send_timeout 1d;
    }
    # Grafana Dashboard UI (served under /grafana)
    location /grafana/ {
        proxy_pass http://grafana;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_cache_bypass $http_upgrade;
    }
}