Erik 1196746dbe fix(agent): SQL parser robust against sqlglot version drift
The query_telemetry_db tool was crashing with AttributeError because
exp.AlterTable doesn't exist in this sqlglot version (renamed to Alter).
Made the deny-class list build defensively via getattr and dropped any
classes that the installed sqlglot doesn't expose.

Also broadened the deny list (Alter, AlterColumn, AlterDatabase, Truncate,
Grant, Revoke, Copy) and made the toplevel allowlist tolerant of missing
classes too. The walk() return shape is also normalized in case sqlglot
versions yield (node, parent, key) tuples vs. bare nodes.

Belt-and-suspenders is fine — the GRANT-SELECT-only PG role is the real
write barrier; the parser is just a faster/friendlier reject path.
2026-04-25 23:07:00 +02:00
agent fix(agent): SQL parser robust against sqlglot version drift 2026-04-25 23:07:00 +02:00
alembic new comments 2025-05-24 18:33:03 +00:00
discord-rare-monitor Add WS message filtering, idle grace period, webhook env var 2026-04-15 00:08:55 +02:00
docs docs: add suitbuilder algorithm documentation 2026-03-07 21:01:25 +00:00
frontend feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
grafana Major overhaul of db -> hypertable conversion, updated GUI, added inventory 2025-06-08 20:51:06 +00:00
inventory-service feat: compute base item values by reversing active spell buffs 2026-04-09 12:31:39 +02:00
nginx feat(agent): cross-char search_items tool + bump timeouts 2026-04-25 21:13:26 +02:00
static feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
.gitignore fix(agent): keep strict permissions server-side, not in repo 2026-04-25 22:26:02 +02:00
.mcp.json fix(agent): point .mcp.json at venv python so MCP deps resolve 2026-04-25 20:45:52 +02:00
AGENTS.md feat: add mana tracker panel to inventory 2026-03-11 20:02:52 +01:00
alembic.ini new comments 2025-05-24 18:33:03 +00:00
CLAUDE.md fix(agent): block Agent + Gmail/Drive/Calendar tools, brief model not to probe 2026-04-25 22:45:39 +02:00
CUserseriknsourcereposdereth-workspacestyle_old.css feat: major cleanup + death alerts + idle detection + Discord webhooks 2026-04-14 16:32:14 +02:00
db.py new comments 2025-05-24 18:33:03 +00:00
db_async.py Server load optimization: spawn_events retention + log spam fix 2026-04-15 10:08:51 +02:00
deploy-frontend.sh feat: major cleanup + death alerts + idle detection + Discord webhooks 2026-04-14 16:32:14 +02:00
docker-compose.yml Fix PostgreSQL shared_buffers (was 96GB on 32GB host) 2026-04-15 10:11:35 +02:00
Dockerfile Remove --reload from production uvicorn CMD 2026-04-14 23:45:43 +02:00
generate_data.py new comments 2025-05-24 18:33:03 +00:00
main.py fix(auth): trust internal Docker/loopback connections in AuthMiddleware 2026-04-25 20:47:47 +02:00
Makefile new comments 2025-05-24 18:33:03 +00:00
README.md docs: README updated for Overlord Agent — host-side service, security stack, deploy flow 2026-04-25 22:50:54 +02:00

Mosswart Overlord (Dereth Tracker)

Real-time telemetry, inventory, and analytics platform for Asheron's Call. FastAPI backend + React frontend + PostgreSQL (TimescaleDB) + Discord integrations, all driven by WebSocket events from the companion MosswartMassacre DECAL plugin.


Overview

Mosswart Overlord is the backend that consumes a firehose of telemetry, vitals, inventory, combat, and chat events from 60+ characters running the MosswartMassacre plugin. It stores selected data in TimescaleDB, runs analytics (combat stats, idle/death detection), and broadcasts live updates to connected browser clients.

The frontend is a React + Vite app served at / with a live map, draggable windows (inventory, chat, combat, radar, etc.), and a server uptime sidebar. The previous vanilla JS frontend is preserved at /classic.

Architecture

                ┌─────────────────────────┐
                │  MosswartMassacre (C#)  │  ← plugin per game client
                └────────────┬────────────┘
                             │ WebSocket /ws/position (authenticated)
                             ▼
┌────────────────────────────────────────────────────────┐
│ dereth-tracker (FastAPI, Docker)                      │
│  • main.py — WS routing, analytics, broadcasts        │
│  • idle/death detection → Discord webhook             │
│  • combat stats delta/lifetime accumulation           │
│  • vital sharing relay (cross-machine)                │
└──┬──────────────────┬────────────────────┬────────────┘
   │                  │                    │
   │ WS /ws/live      │ HTTP               │ HTTP
   ▼                  ▼                    ▼
┌──────────┐  ┌──────────────────┐  ┌──────────────────┐
│ Browsers │  │ inventory-svc    │  │ Discord bot      │
│ (React)  │  │ (FastAPI, Docker)│  │ (rare monitor)   │
└────┬─────┘  └────────┬─────────┘  └──────────────────┘
     │                 ▼
     │           ┌──────────────┐
     │           │ inventory-db │
     │           └──────────────┘
     │
     │ /api/agent/* (host-side, OUTSIDE Docker)
     ▼
┌────────────────────────────────────────┐
│ overlord-agent (FastAPI, systemd)      │ ← runs as dedicated unprivileged user
│  • shells out to `claude -p ...`       │   /var/lib/overlord-agent home,
│  • MCP server: live-state Q&A tools    │   strict settings, no /home/erik
└────────────────────────────────────────┘

   ┌──────────────┐
   │ dereth-db    │ ← TimescaleDB (telemetry, spawns, rares, portals)
   └──────────────┘

Most services run via Docker Compose. overlord-agent is host-side (systemd) because it shells out to the claude CLI which depends on host-side credentials — see AI Assistant.

Features

Live Data

  • Live Map — real-time player positions, dots, trails, portals, heatmap
  • WebSocket firehose (/ws/live) — broadcasts every incoming event to browsers
  • Per-client subscriptions — clients can send {"type":"subscribe","message_types":[...]} to receive only specific event types (the Discord rare monitor bot uses this to filter the 82GB/day firehose down to just rare and chat)
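The per-client subscription behaviour can be sketched in a few lines. This is a minimal illustration, not the code in main.py: the `LiveClient` class and its method names are hypothetical, and only the `subscribe` frame shape comes from this README.

```python
import json
from typing import Optional, Set


class LiveClient:
    """Hypothetical per-connection state for one /ws/live client."""

    def __init__(self) -> None:
        # None means no subscription was sent: forward every event type.
        self.message_types: Optional[Set[str]] = None

    def handle_frame(self, raw: str) -> None:
        """Apply a {"type":"subscribe","message_types":[...]} frame."""
        msg = json.loads(raw)
        if msg.get("type") == "subscribe":
            self.message_types = set(msg.get("message_types", []))

    def wants(self, event: dict) -> bool:
        """Return True if this client should receive the event."""
        if self.message_types is None:
            return True
        return event.get("type") in self.message_types
```

A broadcaster would call `wants(event)` once per connected client before sending, so a bot subscribed to `rare` and `chat` never pays for telemetry frames.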

Inventory

  • Full inventory snapshot on login + incremental inventory_delta updates (add/update/remove)
  • Per-character live refresh in the browser (debounced 2s)
  • Advanced search with filters: material, set, armor level, spells, tinks, workmanship, etc.
  • Suitbuilder at /suitbuilder.html — constraint-based armor optimization across multiple mule inventories with primary/secondary set support, cantrip overlap detection, and real-time SSE streaming

Combat Stats (Mag-Tools style)

  • Plugin parses combat chat into session deltas
  • Backend accumulates lifetime totals from per-session snapshots
  • Offense/defense broken out per damage element
  • Browser combat window shows monster-by-monster damage
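One way to accumulate lifetime totals from per-session snapshots is sketched below. It assumes each snapshot is cumulative for its session, so a later snapshot replaces the earlier one for the same session; the README does not confirm that, and the class and field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict


class LifetimeAccumulator:
    """Keeps the latest snapshot per session and sums them on demand."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, int]] = {}

    def record_snapshot(self, session_id: str, snapshot: Dict[str, int]) -> None:
        # Assumption: snapshots are cumulative within a session, so the
        # newest one supersedes any earlier snapshot for that session.
        self._sessions[session_id] = dict(snapshot)

    def lifetime(self) -> Dict[str, int]:
        """Sum the latest snapshot of every session, per damage element."""
        totals: Dict[str, int] = defaultdict(int)
        for snapshot in self._sessions.values():
            for element, damage in snapshot.items():
                totals[element] += damage
        return dict(totals)
```

Storing the latest snapshot per session, rather than adding every incoming frame, keeps repeated snapshots from the same session from being double-counted.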

Cross-Machine Vital Sharing

  • WebSocket relay replaces UtilityBelt's localhost-only VTankFellowHeals
  • Plugin broadcasts its own vitals and consumes peer vitals
  • In-game DxHud overlay shows peer health/stamina/mana bars with direction arrows

AI Assistant

  • 🤖 chat window in the dashboard backed by claude -p running headless on the server
  • Read-only access to live game state via 12 MCP tools (live players, inventory cross-search, combat stats, quests, suitbuilder, read-only SQL, etc.)
  • Per-browser persistent session, "New Chat" button, history rehydration on reload
  • Hardened: dedicated unprivileged Linux user, systemd lockdown, strict tool whitelist, audit log, rate limit. See AI Assistant section for the full security stack.

Discord Integration

  • Rare Monitor Bot — posts rares (split by common/great) to configured channels
  • Death Alerts — webhook to #alerts when a character's vitae goes from 0 → >0 (rate-limited to one per character per 5 min)
  • Idle Alerts — webhook after 5 minutes of continuous idle state (caught portals, stuck nav, etc.). The grace period prevents false positives on brief idle blips.
  • Vortex Warning — bot watches for "whirlwind of vortexes" chat and posts a warning embed
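The death-alert rule above (vitae going from 0 to a positive value, at most one alert per character per 5 minutes) could look roughly like this. It is a sketch under stated assumptions: the class name, method name, and the use of plain float timestamps are all hypothetical, not taken from main.py.

```python
from typing import Dict

DEATH_COOLDOWN_S = 5 * 60  # one alert per character per 5 minutes


class DeathDetector:
    """Flags a death when vitae moves from 0 to a positive value."""

    def __init__(self) -> None:
        self._last_vitae: Dict[str, float] = {}
        self._last_alert: Dict[str, float] = {}

    def observe(self, character: str, vitae: float, now: float) -> bool:
        """Return True if a death alert should be sent for this sample."""
        previous = self._last_vitae.get(character)
        self._last_vitae[character] = vitae
        # The first sample for a character has no previous value, so it
        # can never count as a 0 -> >0 transition.
        if not (previous == 0 and vitae > 0):
            return False
        last_alert = self._last_alert.get(character)
        if last_alert is not None and now - last_alert < DEATH_COOLDOWN_S:
            return False  # still inside the cooldown window
        self._last_alert[character] = now
        return True
```

Passing `now` in explicitly, instead of calling the clock inside the method, makes the cooldown easy to exercise in a test.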

Portals

  • Automatic discovery + 1-hour retention
  • Coordinate-deduplicated (rounded to 0.1 precision)
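Coordinate deduplication with a 1-hour retention window can be sketched as follows. The `ns`/`ew` parameter names, the `PortalCache` class, and the choice to keep the first discovery time (rather than refreshing it on re-sighting) are assumptions for illustration only.

```python
from typing import Dict, Tuple

PORTAL_RETENTION_S = 3600  # 1-hour retention


def portal_key(ns: float, ew: float) -> Tuple[float, float]:
    """Round both coordinates to 0.1 so nearby sightings collapse."""
    return (round(ns, 1), round(ew, 1))


class PortalCache:
    def __init__(self) -> None:
        # key -> (portal name, time first seen)
        self._seen: Dict[Tuple[float, float], Tuple[str, float]] = {}

    def add(self, name: str, ns: float, ew: float, now: float) -> bool:
        """Store the portal; return False if the rounded spot is known."""
        key = portal_key(ns, ew)
        if key in self._seen:
            return False
        self._seen[key] = (name, now)
        return True

    def prune(self, now: float) -> None:
        """Drop portals older than the retention window."""
        self._seen = {
            key: (name, seen_at)
            for key, (name, seen_at) in self._seen.items()
            if now - seen_at <= PORTAL_RETENTION_S
        }

    def __len__(self) -> int:
        return len(self._seen)
```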

Stats

  • Per-character lifetime kills, deaths, rares, taper counts
  • Grafana dashboards (2x2 iframe grid in the stats window)

Health & Monitoring

  • Server uptime + latency + player count from TreeStats.net (checked every 30s)
  • Only current state is kept — no historical server_health_checks table (removed April 2026 as write-only bloat)

Requirements

  • Docker & Docker Compose (recommended)
  • OR: Python 3.11+, Node.js 20+, and a PostgreSQL 14+ with TimescaleDB

Installation

git clone git@git.snakedesert.se:SawatoMosswartsEnjoyersClub/MosswartOverlord.git
cd MosswartOverlord
cp .env.example .env       # fill in secrets (see Configuration below)
docker compose up -d

Frontend development loop

cd frontend
npm install
npm run dev               # local Vite server
# ...edit files, hot reload...
cd ..
bash deploy-frontend.sh   # builds + copies to static/ for production serving

⚠️ npm run build writes to static/_build/ but the web server serves from static/. You must run deploy-frontend.sh to copy _build/ → static/. Otherwise the browser keeps loading the previous bundle.

Configuration

All secrets go in .env:

Variable                    Purpose
POSTGRES_PASSWORD           Telemetry DB password
INVENTORY_DB_PASSWORD       Inventory DB password
SHARED_SECRET               Plugin auth for /ws/position
SECRET_KEY                  Session cookie signing
DISCORD_RARE_BOT_TOKEN      Bot token for rare monitor
DISCORD_ACLOG_WEBHOOK       Webhook URL for death/idle alerts
GF_SECURITY_ADMIN_PASSWORD  Grafana admin
COMMON_RARE_CHANNEL_ID      Discord channel ID for common rares
GREAT_RARE_CHANNEL_ID       Discord channel ID for great rares
ACLOG_CHANNEL_ID            Discord channel ID for the rare bot's status/vortex messages
MONITOR_CHARACTER           Which character's chat the bot monitors

The Overlord Agent has its own env file at /etc/overlord/agent.env (root:overlord-agent 0640) so it doesn't share the tracker's secrets:

Variable             Purpose
SECRET_KEY           Same value as the tracker; validates browser session cookies
AGENT_DB_DSN         Read-only connection string postgresql://overlord_agent_ro:<pw>@127.0.0.1:5432/dereth
TRACKER_URL          Loopback to the tracker container (default http://127.0.0.1:8765)
AGENT_RATE_MAX       Per-user rate limit (default 60/hour)
AGENT_RATE_WINDOW_S  Rate-limit window in seconds (default 3600)
AGENT_AUDIT_LOG      Path to audit JSONL (default /var/log/overlord-agent/audit.jsonl)
CLAUDE_TIMEOUT_S     Max seconds per claude -p invocation (default 240)
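To show how AGENT_RATE_MAX and AGENT_RATE_WINDOW_S interact, here is a minimal per-user limiter. It uses a sliding window, which is an assumption: the README does not say whether the agent uses a sliding or a fixed window, and the class name is hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allows at most max_requests per user within the last window_s seconds."""

    def __init__(self, max_requests: int = 60, window_s: float = 3600.0) -> None:
        self.max_requests = max_requests
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user: str, now: float) -> bool:
        hits = self._hits[user]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

With the defaults shown in the table, a user's 61st request inside any 3600-second span would be rejected until the oldest request ages out.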

Deploying Changes

Live backend host: overlord.snakedesert.se (SSH user erik, key-based auth).

Quick deploy — Python / static file changes

ssh erik@overlord.snakedesert.se \
  "cd /home/erik/MosswartOverlord && git pull --ff-only origin master"
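# Python changes require a restart (run from the project directory so Compose finds docker-compose.yml):
ssh erik@overlord.snakedesert.se \
  "cd /home/erik/MosswartOverlord && docker compose restart dereth-tracker"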
# Static files (JS/CSS/HTML) are served from the bind-mounted static/ — no restart.

⚠️ Uvicorn runs without --reload in production. Do not add it back — without the watchfiles package it falls back to a polling reloader that busy-loops at 100% CPU and eats a whole core.

React frontend deploy

cd frontend && npm run build && cd ..
bash deploy-frontend.sh
git add static/ && git commit -m "deploy frontend" && git push
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && git pull"
# No container restart needed.

Full rebuild — Dockerfile / pip package / version stamp changes

ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
  git pull --ff-only origin master && \
  export BUILD_VERSION=\"\$(date -u +%Y.%-m.%-d.%H%M)-\$(git rev-parse --short HEAD)\" && \
  docker compose build --no-cache --build-arg BUILD_VERSION=\$BUILD_VERSION dereth-tracker && \
  docker compose up -d dereth-tracker"

BUILD_VERSION is displayed in the sidebar of the live frontend. Format is CalVer: YYYY.M.D.HHMM-gitshorthash.
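For tooling that needs to produce or read the same stamp outside the shell, the format can be reproduced in Python. The function names below are hypothetical; the format itself (unpadded month and day, zero-padded HHMM, short git hash) follows the `date` invocation above.

```python
import re
from datetime import datetime, timezone
from typing import Tuple


def build_version(now: datetime, short_hash: str) -> str:
    """Format a timezone-aware datetime as YYYY.M.D.HHMM-gitshorthash in UTC."""
    utc = now.astimezone(timezone.utc)
    # Month and day are unpadded (like %-m / %-d); HHMM is zero-padded.
    return f"{utc.year}.{utc.month}.{utc.day}.{utc:%H%M}-{short_hash}"


_VERSION_RE = re.compile(
    r"^(\d{4})\.(\d{1,2})\.(\d{1,2})\.(\d{2})(\d{2})-([0-9a-f]+)$"
)


def parse_version(version: str) -> Tuple[datetime, str]:
    """Split a version stamp back into its UTC build time and git hash."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a BUILD_VERSION stamp: {version!r}")
    year, month, day, hour, minute, sha = match.groups()
    built = datetime(
        int(year), int(month), int(day), int(hour), int(minute),
        tzinfo=timezone.utc,
    )
    return built, sha
```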

Overlord Agent deploy

Code changes to agent/ only:

ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
  git pull --ff-only origin master && \
  sudo systemctl restart overlord-agent"
ssh erik@overlord.snakedesert.se "journalctl -u overlord-agent -f"   # tail logs on the server to verify

agent/requirements.txt changed (new pip deps):

ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
  git pull --ff-only origin master && \
  agent/.venv/bin/pip install -r agent/requirements.txt && \
  sudo systemctl restart overlord-agent"

systemd unit changed:

ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
  git pull --ff-only origin master && \
  sudo cp agent/overlord-agent.service /etc/systemd/system/ && \
  sudo systemctl daemon-reload && sudo systemctl restart overlord-agent"

First-time install: bash agent/install.sh — see agent/README.md for the full bootstrap procedure (creating the overlord-agent user, copying claude auth, granting filesystem access, populating /etc/overlord/agent.env).

WebSocket Contract

/ws/position (plugin → backend)

Authenticated via ?secret=<SHARED_SECRET> or X-Plugin-Secret header. Accepts JSON frames with a type discriminator:

type                        Purpose
telemetry                   Position, kills, session metrics (every 2s per character)
vitals                      Health/stamina/mana/vitae percentages
character_stats             Full attributes/skills/allegiance (every 10 min)
inventory / full_inventory  Complete inventory dump on login
inventory_delta             Incremental add/update/remove of a single item
equipment_cantrip_state     Equipped spell effects
portal                      Discovered portal with coordinates
spawn                       Monster spawn observation
chat                        In-game chat line (any channel)
quest                       Quest timer / progress
rare                        Rare item find notification
nearby_objects              On-demand radar data (nearby entities)
combat_stats                Session combat snapshot (Mag-Tools parser output)
share_*                     Cross-machine vital/debuff sharing envelopes
dungeon_map                 Dungeon floor tile data for radar overlay

See EVENT_FORMATS.json for exact per-type schemas.
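The two rules of this endpoint, secret-based authentication and routing on the `type` discriminator, can be sketched as below. This is an illustration rather than the logic in main.py: the function names are hypothetical, header keys are assumed to be lowercased already, and the constant-time comparison via `hmac.compare_digest` is a suggested practice, not something the README states.

```python
import hmac
import json
from typing import Mapping

# Type names copied from the table above; share_* is matched by prefix.
KNOWN_TYPES = {
    "telemetry", "vitals", "character_stats", "inventory", "full_inventory",
    "inventory_delta", "equipment_cantrip_state", "portal", "spawn", "chat",
    "quest", "rare", "nearby_objects", "combat_stats", "dungeon_map",
}


def is_authorized(query: Mapping[str, str], headers: Mapping[str, str],
                  shared_secret: str) -> bool:
    """Accept the secret from ?secret=... or the X-Plugin-Secret header."""
    supplied = query.get("secret") or headers.get("x-plugin-secret")
    if not supplied:
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(supplied.encode(), shared_secret.encode())


def classify_frame(raw: str) -> str:
    """Return the routing label for one incoming JSON frame."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return "invalid"
    if not isinstance(msg, dict) or not isinstance(msg.get("type"), str):
        return "invalid"
    kind = msg["type"]
    if kind.startswith("share_"):
        return "share"
    return kind if kind in KNOWN_TYPES else "unknown"
```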

/ws/live (browser ↔ backend)

Session-cookie authenticated (except for internal Docker network clients, which are trusted by IP). Clients can:

  • Send {"type":"subscribe","message_types":["rare","chat"]} to filter which events they receive. Without subscribing, all types are forwarded (browser default).
  • Send {"player_name":"Larsson","command":"/radar start"} to route a command to that character's plugin client.
  • Send {"type":"request_dungeon_map","landblock":"..."} to pull cached dungeon tile data.

Backend pushes the same firehose (subject to subscription filter) to every browser client.

HTTP API Reference

See EVENT_FORMATS.json for event schemas. Major HTTP endpoints:

  • GET /live — active players seen in the last 30s
  • GET /history?from=…&to=… — historical telemetry snapshots
  • GET /trails — recent player trails for the map
  • GET /spawns/heatmap?hours=N — aggregated spawn density
  • GET /portals — discovered portals within retention window
  • GET /inventory/{character} — current inventory (proxied to inventory-service)
  • GET /character-stats/{character} — full character attributes/skills
  • GET /combat-stats/{character} — session + lifetime combat stats
  • GET /vital-sharing/peers — currently-registered vital sharing peers
  • GET /api-version — build version stamp
  • GET /server-health — current Coldeve server status + player count

Frontend

React v2 (primary, at /)

  • Map-first layout with draggable/resizable windows
  • Code-split bundles: one chunk per window type, lazy-loaded on open
  • Window types: Chat, Stats, Inventory, Character, Radar, CombatStats, CombatPicker, Issues, VitalSharing, QuestStatus, PlayerDashboard
  • Per-character inventory version counter — an open inventory window refreshes 2s after its own character's last inventory_delta, ignoring unrelated traffic
  • Direct DOM pan/zoom on the map (no React state per frame)
  • Service worker caches a small whitelist of static assets
  • Version badge in the sidebar confirms which build is loaded

Classic v1 (preserved at /classic)

The original vanilla JS frontend with element-pooling optimization is kept for fallback and reference.

AI Assistant (Overlord Agent)

A draggable chat window in the dashboard (🤖 Assistant button). Powered by claude -p running headless on the server, with read-only access to live game state via an MCP server.

Architecture

  • Host-side service (agent/, systemd unit overlord-agent) runs OUTSIDE Docker because the claude CLI binary lives on the host (/home/erik/.local/bin/claude) and depends on host-side authentication credentials.
  • Dedicated UNIX user (overlord-agent, system account, /var/lib/overlord-agent home, no shell) — kernel-level isolation from the operator's erik account. Cannot read /home/erik/.claude, ~/.ssh, .bash_history, .env, etc.
  • MCP stdio server (agent/mcp_overlord.py) exposes 12 tools that wrap the tracker's HTTP endpoints + read-only DB queries. Claude only sees these tools; no Bash, Read, Write, etc.
  • Frontend (AgentWindow.tsx) — per-browser session UUID in localStorage, "New Chat" button, on-mount rehydration from /agent/sessions/{id}/history.

MCP tools available to the assistant

get_live_players, get_player_state, get_combat_stats, get_equipment_cantrips, get_inventory, get_inventory_search, search_items (cross-character), get_recent_rares, get_quest_status, get_server_health, query_telemetry_db (read-only SQL via sqlglot parser + GRANT-SELECT-only PG role), suitbuilder_search. Plus WebFetch(domain:acpedia.org) for AC info lookups.

Security stack (defense-in-depth)

  1. Cookie auth on /agent/ask (same session cookie the tracker issues)
  2. Per-user rate limit (60 req/h default) and concurrency cap (1 in-flight)
  3. JSONL audit log at /var/log/overlord-agent/audit.jsonl (every prompt + result)
  4. CLI flags: --allowed-tools (just our 12 MCP tools), --disallowed-tools (Bash, Write, Read, Edit, Agent, ToolSearch, Monitor, scheduling, Gmail/Drive/Calendar, etc.), --permission-mode dontAsk
  5. /var/lib/overlord-agent/.claude/settings.json — strict deny rules (server-side only, NOT in repo)
  6. System-prompt scope rules in CLAUDE.md — instruct the model not to probe, not to suggest workarounds
  7. SQL parser (sqlglot) rejects any non-SELECT statement on query_telemetry_db
  8. Read-only PG role overlord_agent_ro (GRANT SELECT only) — even a parser bypass can't mutate
  9. systemd hardening: ProtectSystem=strict, ProtectHome=read-only, InaccessiblePaths=/etc/shadow,/root,~/.ssh,…, NoNewPrivileges=true, CapabilityBoundingSet= (empty), PrivateTmp=true, PrivateDevices=true, RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6, SystemCallFilter=@system-service ~@privileged ~@reboot ~@mount, MemoryMax=512M, TasksMax=128
  10. Secrets out of /home: /etc/overlord/agent.env (root:overlord-agent 0640) holds SECRET_KEY + AGENT_DB_DSN
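Layer 7 depends on sqlglot expression classes whose names differ between versions (for example AlterTable vs. Alter). The defensive pattern of resolving the deny list with getattr and normalizing walk() items can be sketched without sqlglot itself. Everything below is a stand-in: `fake_exp` imitates an expressions module, and the function names are hypothetical, not those in agent/tools.py.

```python
from types import SimpleNamespace
from typing import Iterable, Sequence, Tuple


def resolve_classes(module: object, names: Sequence[str]) -> Tuple[type, ...]:
    """Return only the classes the module actually exposes; skip the rest."""
    found = []
    for name in names:
        cls = getattr(module, name, None)
        if isinstance(cls, type):
            found.append(cls)
    return tuple(found)


def normalize_walk_item(item: object) -> object:
    """Accept either a bare node or a (node, parent, key) tuple."""
    return item[0] if isinstance(item, tuple) else item


# Stand-in for a parser's expression module (NOT sqlglot itself).
class Select: ...
class Alter: ...
class Drop: ...

fake_exp = SimpleNamespace(Select=Select, Alter=Alter, Drop=Drop)

# "AlterTable" and "Truncate" are missing from the stand-in and are skipped
# instead of raising AttributeError.
DENY = resolve_classes(fake_exp, ["Alter", "AlterTable", "Drop", "Truncate"])


def is_allowed(walked: Iterable[object]) -> bool:
    """Reject the statement if any walked node is a denied class."""
    return not any(isinstance(normalize_walk_item(n), DENY) for n in walked)
```

Because missing names are dropped silently, the parser is only a fast reject path; the read-only PG role in layer 8 remains the actual write barrier.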

Files

Path                                            What
agent/service.py                                FastAPI app: /agent/health, /agent/sessions/new, /agent/ask, /agent/sessions/{id}/history
agent/auth.py                                   Session cookie validation (mirrors main.py:1013-1019)
agent/claude_wrapper.py                         asyncio.create_subprocess_exec("claude", "-p", …) with allowed/disallowed-tools
agent/tools.py                                  Pure tool implementations
agent/mcp_overlord.py                           MCP stdio server registering tools
agent/sql/0001_overlord_agent_ro.sql            Read-only PG role
agent/overlord-agent.service                    systemd unit (the hardening directives)
agent/install.sh                                venv + systemd setup
agent/README.md                                 Operator's deeper reference
.mcp.json (repo root)                           Project-level MCP config Claude Code auto-loads
CLAUDE.md "Overlord Assistant Mode" section     System-prompt briefing
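The subprocess pattern named for agent/claude_wrapper.py, combined with a per-call timeout such as CLAUDE_TIMEOUT_S, might look like the sketch below. The function name is hypothetical and the command is supplied by the caller; nothing here reproduces the real wrapper's arguments.

```python
import asyncio
from typing import Sequence, Tuple


async def run_with_timeout(argv: Sequence[str], timeout_s: float) -> Tuple[int, str]:
    """Run a command, capture stdout, and kill it if it exceeds timeout_s."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _stderr = await asyncio.wait_for(
            proc.communicate(), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Do not leave a runaway child behind: kill it and reap it.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, stdout.decode()
```

Using `create_subprocess_exec` with an argument list, rather than a shell string, means the user's prompt is never interpreted by a shell.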

Routing

nginx forwards /api/agent/* to 127.0.0.1:8767 (the host-side service) with a 300s read/send timeout (suitbuilder runs can be slow). Other /api/* continues to the dereth-tracker container at 127.0.0.1:8765.

Cost / quota

Subscription auth (no API key); per-call cost is informational only. Each /agent/ask invocation = one claude -p subprocess with shared session cache. Reactive only — no background polling, no scheduled tasks.

Database Schema

Telemetry DB (dereth, TimescaleDB)

Table                  Type        Retention  Purpose
telemetry_events       hypertable  30 days    Position/stats snapshots
spawn_events           hypertable  7 days     Monster spawn observations (heatmap source)
rare_events            regular     forever    Rare find history
portals                regular     1 hour     Discovered portals, dedup by rounded coords
char_stats             regular     forever    Per-character lifetime kill total
rare_stats             regular     forever    Per-character lifetime rare total
rare_stats_sessions    regular     forever    Per-session rare count
combat_stats           regular     forever    Lifetime combat accumulator
combat_stats_sessions  regular     forever    Per-session combat snapshots
character_stats        regular     forever    Latest full stats JSON per character
server_status          regular     forever    Current Coldeve server state (single row)

Inventory DB (inventory_db, PostgreSQL)

Normalized schema: items, item_combat_stats, item_requirements, item_enhancements, item_ratings, item_spells, item_raw_data.

items.container_id stores the in-game ID of the container holding the item (0 = character body). The frontend groups items into packs by this ID.
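The grouping itself happens in the React frontend; the same idea is shown here in Python for consistency with the backend. The item dictionaries and the `name` field are hypothetical, and only `container_id` and its meaning of 0 come from the paragraph above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_container(items: Iterable[dict]) -> Dict[int, List[dict]]:
    """Bucket items by container_id; key 0 is the character's own body."""
    packs: Dict[int, List[dict]] = defaultdict(list)
    for item in items:
        packs[item.get("container_id", 0)].append(item)
    return dict(packs)
```

For example, two items with `container_id` 0 and one with `container_id` 77 would produce a dictionary with keys 0 and 77, holding two and one items respectively.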

Operations & Health

PostgreSQL tuning

dereth-db runs with explicit memory overrides in docker-compose.yml:

  • shared_buffers=8GB (was 96GB via auto-tune on a 32GB host — caused thrashing)
  • effective_cache_size=16GB
  • work_mem=16MB, maintenance_work_mem=1GB
  • max_wal_size=4GB

Retention policies

  • telemetry_events: 30-day drop, daily
  • spawn_events: 7-day drop, daily
  • portals: 1-hour cleanup (background task in main.py)
  • server_health_checks: removed — was write-only, 850K rows of nothing

Log levels

Both dereth-tracker and inventory-service run at LOG_LEVEL=INFO. Do not set to DEBUG in production — it dumps full inventory_delta payloads for every item update (hundreds of KB/sec).

Host (Proxmox VM)

  • 6 vCPU, 32 GiB RAM (of which ~30 GiB is normally free under current load)
  • Live host: overlord.snakedesert.se
  • Reverse proxy: Nginx on the host terminates TLS and strips the /api/ prefix before forwarding to port 8765

Debug commands

docker ps
docker logs mosswartoverlord-dereth-tracker-1 --tail 100
docker logs mosswartoverlord-inventory-service-1 --tail 100
docker logs mosswartoverlord-discord-rare-monitor-1 --tail 100
docker exec -it dereth-db psql -U postgres -d dereth   # -it attaches an interactive terminal for the psql prompt

Contributing

Contributions welcome. Please:

  • Keep cross-repo protocol changes additive (new optional fields > renames/removes)
  • Update both this README and CLAUDE.md when workflows change
  • Test end-to-end: plugin → backend → browser for any new event type

For detailed architecture notes and ongoing investigations, see CLAUDE.md and docs/plans/.