security: enforce real plugin secret, fix proxy auth bypass, loopback DB ports, nightly backups

- SHARED_SECRET now read from env and fail-closed: unset/placeholder refuses
  ALL plugin connections (constant-time compare). The old hardcoded
  'your_shared_secret' in this public repo was no auth at all. Dockerfile
  default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of
  falling back to a publicly-known signing key; agent systemd unit now
  requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every
  nginx-proxied internet request satisfied via docker-proxy — full session
  bypass and unauthenticated in-game command injection) with
  private-source AND no X-Forwarded-For, i.e. only genuinely internal
  callers (overlord-agent on the host, compose-network services). Invariant
  documented in nginx/overlord.conf: every tracker-bound location must set
  X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were internet-
  reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs
  (telemetry/spawn hypertable data excluded), 10MB canary, umask 077,
  TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.

Adversarially reviewed pre-deploy (3-lens workflow): ship verdict; deploy-
sequencing blockers addressed (secret staged before enforcement, exec bit
set, cron uses bash).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
Erik 2026-06-10 17:02:47 +02:00
parent c6a1af0c39
commit a28b61511c
12 changed files with 261 additions and 2579 deletions

100
main.py
View file

@ -8,7 +8,9 @@ endpoints for browser clients to retrieve live and historical data, trails, and
from collections import defaultdict
from datetime import datetime, timedelta, timezone
import hmac
import html as _html
import ipaddress
import json
import logging
import os
@ -990,10 +992,25 @@ live_equipment_cantrip_states: Dict[str, dict] = {}
live_nearby_objects: Dict[str, dict] = {}
dungeon_map_cache: Dict[str, dict] = {} # landblock hex string -> dungeon map data
# Shared secret used to authenticate plugin WebSocket connections (override for production)
SHARED_SECRET = "your_shared_secret"
# Secret key for signing session cookies (override via SECRET_KEY env var)
SECRET_KEY = os.getenv("SECRET_KEY", "change-me-in-production-please")
# Shared secret used to authenticate plugin WebSocket connections.
# MUST come from the environment — this repo is public, so a hardcoded value
# is no auth at all. When unset (or left at the old placeholder) we fail
# closed: every plugin connection is refused until it is configured.
SHARED_SECRET = os.getenv("SHARED_SECRET", "")
_SHARED_SECRET_OK = bool(SHARED_SECRET) and SHARED_SECRET != "your_shared_secret"
if not _SHARED_SECRET_OK:
logger.critical(
"SHARED_SECRET env var is unset or still the placeholder — "
"refusing ALL plugin WebSocket connections until it is set in .env"
)
# Secret key for signing session cookies. Fail closed: running with a
# publicly-known default would let anyone forge admin sessions.
SECRET_KEY = os.getenv("SECRET_KEY", "")
if not SECRET_KEY or SECRET_KEY == "change-me-in-production-please":
raise RuntimeError(
"SECRET_KEY env var must be set to a strong random value — "
"session cookies are signed with it"
)
SESSION_MAX_AGE = 30 * 24 * 3600 # 30 days in seconds
_serializer = URLSafeTimedSerializer(SECRET_KEY)
@ -1024,6 +1041,19 @@ _PUBLIC_PATHS = {"/login", "/logout"}
_PUBLIC_PREFIXES = ("/ws/position",) # Plugin WS uses X-Plugin-Secret
def _is_private_addr(host: str) -> bool:
"""True when `host` is a private/loopback address (RFC1918, 127/8, ::1).
Used by the internal-trust rule: a private TCP peer WITHOUT an
X-Forwarded-For header cannot have come through nginx and therefore
cannot originate from the internet.
"""
try:
return ipaddress.ip_address(host).is_private
except ValueError:
return False
class AuthMiddleware(BaseHTTPMiddleware):
"""Redirect unauthenticated requests to /login."""
@ -1046,20 +1076,20 @@ class AuthMiddleware(BaseHTTPMiddleware):
if path.startswith("/ws/live"):
return await call_next(request)
# Trust internal connections (Docker network gateway + loopback). The
# tracker port (8765) is bound to 127.0.0.1 in docker-compose.yml and
# only the host or other compose-network containers can reach it.
# This lets host-side helpers (overlord-agent, discord-rare-monitor,
# etc.) call any endpoint without forging a session cookie.
#
# IMPORTANT: We still try to decode the session cookie if present, so
# that endpoints like /me which check `request.state.user` work for
# real authenticated browsers proxied through nginx → docker-proxy
# (which makes them look like they're coming from 172.x). Without
# this, /me returned 401 even for logged-in users, silently
# disabling the admin-only UI on the dashboard.
# Trust genuinely internal callers only. The tracker port (8765) is
# published on 127.0.0.1, so host-side helpers (overlord-agent) and
# compose-network containers reach it directly — but so does ALL
# external browser traffic, via nginx → docker-proxy, which makes it
# arrive with a 172.x source IP. Source IP alone therefore proves
# nothing. The distinguishing signal is X-Forwarded-For: nginx sets
# it on every proxied request, while direct internal calls have no
# proxy in front of them and lack the header. A request with a
# private source AND no X-Forwarded-For cannot have come through
# nginx, i.e. cannot originate from the internet.
client_host = request.client.host if request.client else ""
if client_host.startswith("172.") or client_host in ("127.0.0.1", "::1", "localhost"):
if _is_private_addr(client_host) and "x-forwarded-for" not in request.headers:
# Still decode the cookie if present so request.state.user works
# for internal tools that do log in.
token = request.cookies.get("session")
if token:
user = verify_session_cookie(token)
@ -2945,9 +2975,13 @@ async def ws_receive_snapshots(
"""
global _plugin_connections
# Authenticate plugin connection using shared secret
key = secret or x_plugin_secret
if key != SHARED_SECRET:
# Authenticate plugin connection using shared secret (constant-time
# compare; refuse everything when the secret is not configured).
key = secret or x_plugin_secret or ""
# compare bytes: compare_digest(str, str) raises TypeError on non-ASCII
if not _SHARED_SECRET_OK or not hmac.compare_digest(
key.encode("utf-8", "replace"), SHARED_SECRET.encode("utf-8")
):
# Reject without completing the WebSocket handshake
logger.warning(
f"Plugin WebSocket authentication failed from {websocket.client}"
@ -3693,11 +3727,16 @@ async def ws_live_updates(websocket: WebSocket):
Manages a set of connected browser clients; listens for incoming command messages
and forwards them to the appropriate plugin client WebSocket.
"""
# Require valid session cookie for browser WebSocket.
# Internal Docker network connections (172.x.x.x) are trusted — this allows
# the Discord bot and other internal services to connect without a cookie.
# Require a valid session cookie for browser WebSockets. Internal
# services (discord-rare-monitor connects over the compose network) are
# identified by a private source IP WITHOUT an X-Forwarded-For header —
# nginx-proxied browser traffic always carries X-Forwarded-For, so an
# internet client can never satisfy this check (same rule as
# AuthMiddleware; see comment there).
client_host = websocket.client.host if websocket.client else ""
is_internal = client_host.startswith("172.") or client_host in ("127.0.0.1", "::1", "localhost")
is_internal = (
_is_private_addr(client_host) and "x-forwarded-for" not in websocket.headers
)
if not is_internal:
token = websocket.cookies.get("session")
if not token or not verify_session_cookie(token):
@ -3865,15 +3904,18 @@ async def get_stats(character_name: str):
@app.post("/character-stats/test")
async def test_character_stats_default():
"""Inject mock character_stats data for frontend development."""
return await test_character_stats("TestCharacter")
async def test_character_stats_default(request: Request):
"""Inject mock character_stats data for frontend development (admin only)."""
_require_admin(request)
return await test_character_stats("TestCharacter", request)
@app.post("/character-stats/test/{name}")
async def test_character_stats(name: str):
async def test_character_stats(name: str, request: Request):
"""Inject mock character_stats data for a specific character name.
Processes through the same pipeline as real plugin data."""
Processes through the same pipeline as real plugin data it OVERWRITES
the real character_stats row for {name}, hence admin-only."""
_require_admin(request)
mock_data = {
"type": "character_stats",
"timestamp": datetime.utcnow().isoformat() + "Z",