docs: Go is production — rewrite README, update CLAUDE.md, gitignore .env
- README: Go-backend architecture, build/run via the compose override stack, WS/payload/auth/DB contracts, the branch layout (master = Go, python-legacy).
- CLAUDE.md: Project Overview + Components reflect the Go services; a "Go services — build, deploy, gotchas" section (string coercion, typeless telemetry, the trinket dedup, rollback); Deploying + Suitbuilder point at the Go paths. The behavioral contracts (WS/auth/DB/routes) are kept; Go honors them. File refs to main.py/inventory-service mark the legacy source.
- .gitignore: ignore .env / .env.bak-* (public repo; .env.example stays tracked).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 5ade47dc64
commit 9911edbfa8
3 changed files with 172 additions and 414 deletions
6
.gitignore
vendored
@@ -3,6 +3,12 @@ __pycache__
static/v2/
frontend/node_modules/

# Secrets — the server-side env files hold SHARED_SECRET, SECRET_KEY, DB
# passwords, and the Discord token. This repo is PUBLIC — never commit them.
# .env.example stays tracked as the template.
.env
.env.bak-*

# Claude Code config — never commit. The production agent's strict
# permissions live server-side at /var/lib/overlord-agent/.claude/
# (and via CLI flags in agent/claude_wrapper.py). The repo stays
43
CLAUDE.md
@@ -5,22 +5,41 @@ Cross-repo workflows (plugin coupling, deploy commands, nginx) live in the works
|
|||
|
||||
## Project Overview
|
||||
|
||||
Dereth Tracker is a real-time telemetry platform for Asheron's Call world tracking. A FastAPI WebSocket/HTTP service (`main.py`, single file ~4200 lines) ingests player data from the MosswartMassacre DECAL plugin and serves a live React dashboard, with TimescaleDB persistence, a separate inventory microservice, Grafana dashboards, a Discord rare bot, and a host-side Claude-powered assistant.
|
||||
Dereth Tracker is a real-time telemetry platform for Asheron's Call world tracking. **The production backend is Go** (`go-services/`): a tracker service (`tracker-go/`) ingests player data from the MosswartMassacre DECAL plugin over `/ws/position`, serves the React dashboard, the login/admin pages, and the read API, and writes to TimescaleDB; an inventory service (`inventory-go/`) handles item search, the suitbuilder solver, and inventory ingestion. The stack also includes Grafana, a Python Discord rare bot, and a host-side Claude-powered assistant.
|
||||
|
||||
The original Python/FastAPI implementation (`main.py` ~4200 lines, `inventory-service/`) is preserved on the **`python-legacy`** branch; the Go services were validated byte-identical against it in a parallel "strangler-fig" run, then production was cut over. ⚠ **The behavioral contracts below (WS, auth, DB, routes, suitbuilder) describe what Go honors. Where they cite `main.py` / `inventory-service/`, that's the legacy source that defined the contract — the live implementation is the corresponding Go handler.**
|
||||
|
||||
## Components
|
||||
|
||||
| Component | Where | Runs as |
|
||||
|---|---|---|
|
||||
| Tracker API (`main.py`) | repo root | Docker `dereth-tracker`, 127.0.0.1:8765 |
|
||||
| Telemetry DB (TimescaleDB) | `db_async.py` schema | Docker `dereth-db`, port 5432 |
|
||||
| Inventory service + DB | `inventory-service/` | Docker `inventory-service` (127.0.0.1:8766) + `inventory-db` (5433) |
|
||||
| React frontend | `frontend/` → built into `static/` | served by tracker (FastAPI StaticFiles) |
|
||||
| Classic v1 frontend | `static/classic/` | served at `/classic` |
|
||||
| Legacy vanilla pages | `static/inventory.html`, `static/suitbuilder.html` | still live |
|
||||
| **Tracker** (ingest + website + read API + WS) | `go-services/tracker-go/` | Docker `dereth-tracker-go`, 127.0.0.1:8770 |
|
||||
| **Inventory** (search + suitbuilder + ingestion) | `go-services/inventory-go/` | Docker `inventory-go`, 127.0.0.1:8772 |
|
||||
| Telemetry DB (TimescaleDB) | schema in `tracker-go/schema.go` (replica of legacy `db_async.py`) | Docker `dereth-db`, port 5432 |
|
||||
| Inventory DB | schema in `inventory-go/schema.go` | Docker `inventory-db`, 5433 |
|
||||
| React frontend | `frontend/` → built into `static/` | served by `tracker-go` (static file server, SPA fallback) |
|
||||
| Classic v1 / legacy pages | `static/classic/`, `static/*.html` | served by `tracker-go` |
|
||||
| Grafana | compose service `dereth-grafana` | 127.0.0.1:3000, anonymous Viewer auth, proxied at `/grafana/` |
|
||||
| Discord rare bot | `discord-rare-monitor/` | Docker, connects to `/ws/live` internally |
|
||||
| Discord rare bot | `discord-rare-monitor/` (Python) | Docker, reads the Go `/ws/live` |
|
||||
| Overlord Agent (assistant) | `agent/` | **host-side systemd service** `overlord-agent`, 127.0.0.1:8767 |
|
||||
|
||||
### Go services — build, deploy, gotchas
|
||||
|
||||
- **Build on the server, no host Go needed** (multi-stage distroless images). Go 1.25, `pgx/v5`, `coder/websocket`, `bwmarrin/discordgo`, `x/crypto/bcrypt`. Sync + build + recreate:
|
||||
```bash
# 1. Sync the go-services/ tree to the server (no Go toolchain needed on the host).
tar czf - go-services | ssh erik@overlord.snakedesert.se "tar xzf - -C /home/erik/MosswartOverlord/"
# 2. On the server: stamp the build, build both images, then recreate the
#    two containers with the cutover override applied.
ssh erik@overlord.snakedesert.se 'cd /home/erik/MosswartOverlord && \
  export BUILD_VERSION="$(date -u +%Y.%-m.%-d.%H%M)-$(git rev-parse --short HEAD)" && \
  docker compose -f docker-compose.yml -f go-services/docker-compose.go.yml build dereth-tracker-go inventory-go && \
  docker compose -f docker-compose.yml -f go-services/docker-compose.go.yml -f go-services/docker-compose.cutover.yml \
    up -d --no-deps dereth-tracker-go inventory-go'
```
|
||||
- **`docker-compose.cutover.yml`** is what makes the Go services production: `READ_ONLY=false` (write the prod DBs), `SKIP_SCHEMA_INIT=true` (trust the existing schema, run NO DDL), `SHARED_SECRET`/`DISCORD_ACLOG_WEBHOOK` for the tracker, and the Discord bot repointed at `ws://dereth-tracker-go:8770/ws/live`. Drop it to revert to read-only parallel mode.
|
||||
- **Rollback** = `docker compose ... up -d` WITHOUT the cutover override (Go → read-only) + start the Python `dereth-tracker`/`inventory-service` + revert the nginx `http://tracker_go/` lines to `http://tracker/`.
|
||||
- ⚠ **Plugin sends some numeric fields as STRINGS** (`kills_per_hour`, `deaths`, `total_deaths`, `prismatic_taper_count`). Go coerces via `coerceNum` (`tracker-go/reads.go`) — pydantic did this implicitly; a plain number cast would write null/0.
|
||||
- ⚠ **Telemetry must be broadcast TYPELESS** to `/ws/live` (`stripType` in `tracker-go/ingest.go`). The browser ignores typeless messages and uses the 5 s `/live` poll for player data; broadcasting telemetry WITH a type makes the UI overwrite the /live-derived counters and flap them 0↔value.
|
||||
- ⚠ `inventory-go` `slot_names=Trinket` must exclude `%bracelet%` or bracelets duplicate the Wrist buckets in the suitbuilder.
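The string-coercion and typeless-broadcast gotchas above could be sketched roughly as follows. This is an illustration only, not the code in `tracker-go/reads.go` or `tracker-go/ingest.go`: the names `coerceNum` and `stripType` come from the notes above, but their signatures, the `character_name` field, and the sample values are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// coerceNum accepts a decoded JSON value that may be a number or a numeric
// string (the plugin sends fields such as kills_per_hour as strings) and
// returns it as a float64. The signature is assumed for illustration.
func coerceNum(v any) (float64, bool) {
	switch t := v.(type) {
	case float64: // encoding/json decodes JSON numbers to float64 by default
		return t, true
	case json.Number: // only seen when a Decoder has UseNumber enabled
		f, err := t.Float64()
		return f, err == nil
	case string:
		f, err := strconv.ParseFloat(strings.TrimSpace(t), 64)
		return f, err == nil
	default:
		return 0, false
	}
}

// stripType returns a copy of msg without its "type" key, so a browser that
// ignores typeless messages will not act on the broadcast. Signature assumed.
func stripType(msg map[string]any) map[string]any {
	out := make(map[string]any, len(msg))
	for k, v := range msg {
		if k != "type" {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Hypothetical telemetry frame with string-typed numeric fields.
	raw := []byte(`{"type":"telemetry","character_name":"Example","kills_per_hour":"123.5","deaths":"2"}`)
	var msg map[string]any
	if err := json.Unmarshal(raw, &msg); err != nil {
		panic(err)
	}
	kph, ok := coerceNum(msg["kills_per_hour"])
	fmt.Println(kph, ok)

	out, err := json.Marshal(stripType(msg))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Returning a second `ok` value lets the caller decide what to write when a field is missing or malformed, instead of silently storing 0.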
|
||||
|
||||
## WebSocket endpoints
|
||||
|
||||
- `/ws/position` — plugin ingest (telemetry, inventory, portal, rare, combat, share_*, …). Authenticated by `X-Plugin-Secret` header against the `SHARED_SECRET` env var; fails closed (refuses all plugins) when unset or left at the old placeholder. Constant-time compare.
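A fail-closed, constant-time check like the one described for `/ws/position` might look like the sketch below. It is an assumption-laden illustration, not the tracker's handler: `secretOK` is a hypothetical helper, and the placeholder value is passed in as a parameter because the notes do not state what it is.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// secretOK reports whether the X-Plugin-Secret header value matches the
// configured SHARED_SECRET. It fails closed: if the configured secret is
// empty or still equal to the placeholder, every caller is refused.
func secretOK(headerValue, sharedSecret, placeholder string) bool {
	if sharedSecret == "" || sharedSecret == placeholder {
		return false
	}
	// ConstantTimeCompare returns 1 only when both slices are equal. It
	// returns early on a length mismatch, so the secret's length is not hidden.
	return subtle.ConstantTimeCompare([]byte(headerValue), []byte(sharedSecret)) == 1
}

func main() {
	// "hypothetical-placeholder" stands in for the old default value.
	fmt.Println(secretOK("example-secret", "example-secret", "hypothetical-placeholder"))
	fmt.Println(secretOK("example-secret", "", "hypothetical-placeholder"))
}
```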
|
||||
|
|
@@ -63,12 +82,14 @@ Dereth Tracker is a real-time telemetry platform for Asheron's Call world tracki
|
|||
|
||||
## Suitbuilder
|
||||
|
||||
Production equipment-optimization engine (`inventory-service/suitbuilder.py`): multi-character search, armor set constraints, cantrip overlap detection, SSE streaming. UI at `/suitbuilder.html`. Architecture doc: `docs/plans/2026-02-09-suitbuilder-architecture.md`.
|
||||
Known limitations: no slot-aware spell filtering, equal spell weighting. The legacy `/optimize/*` solver in inventory-service/main.py is a near-duplicate — `suitbuilder.py` is the production path.
|
||||
Production equipment-optimization engine, ported to Go in `inventory-go/suit_*.go` (constraint-satisfaction DFS: multi-character search, armor set constraints, cantrip overlap, SSE streaming) — validated byte-identical against the legacy `inventory-service/suitbuilder.py`. Live endpoint: `POST /suitbuilder/search` (the tracker proxies `/inv/suitbuilder/search`); the `/optimize/*` solver in the legacy `inventory-service/main.py` was a near-duplicate and is NOT the live path. UI at `/suitbuilder.html`. Known limitations: no slot-aware spell filtering, equal spell weighting.
|
||||
|
||||
## Deploying
|
||||
|
||||
See workspace `../CLAUDE.md` "Build & Deploy Instructions" — quick deploy (git pull + `docker compose restart dereth-tracker` for Python; nothing for static), `deploy-frontend.sh` for React, full `--no-cache` rebuild only for Dockerfile/pip/version-stamp changes. Bind mounts: `main.py`, `db_async.py`, `static/`, `alembic/` only.
|
||||
- **Go backend changes** → see "Go services — build, deploy, gotchas" above (sync `go-services/`, build, recreate with the cutover override). `BUILD_VERSION` (CalVer `YYYY.M.D.HHMM-gitshorthash`) shows in the frontend sidebar.
|
||||
- **Frontend** → `bash deploy-frontend.sh` (complete build+copy into `static/`); the tracker serves `static/` from a bind mount, no restart needed.
|
||||
- **Overlord Agent** → unchanged (host-side Python systemd): `git pull && sudo systemctl restart overlord-agent`.
|
||||
- `README.md` has the full build/run reference. The legacy Python deploy lives on the `python-legacy` branch.
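For reference, the `BUILD_VERSION` CalVer format mentioned above (`YYYY.M.D.HHMM-gitshorthash`, as produced by `date -u +%Y.%-m.%-d.%H%M`) can be reproduced in Go roughly like this. `calVer` and `buildVersion` are hypothetical helper names, and the date and hash in `main` are sample values.

```go
package main

import (
	"fmt"
	"time"
)

// calVer renders YYYY.M.D.HHMM-hash: month and day are unpadded, hour and
// minute are zero-padded to two digits each.
func calVer(year, month, day, hour, minute int, shortHash string) string {
	return fmt.Sprintf("%d.%d.%d.%02d%02d-%s", year, month, day, hour, minute, shortHash)
}

// buildVersion converts t to UTC first, matching the -u flag of date.
func buildVersion(t time.Time, shortHash string) string {
	u := t.UTC()
	return calVer(u.Year(), int(u.Month()), u.Day(), u.Hour(), u.Minute(), shortHash)
}

func main() {
	// Sample inputs only; the real stamp is computed in the shell at build time.
	t := time.Date(2026, time.February, 9, 7, 5, 0, 0, time.UTC)
	fmt.Println(buildVersion(t, "9911edb"))
}
```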
|
||||
|
||||
## Operational notes
|
||||
|
||||
|
|
|
|||
537
README.md
@@ -1,424 +1,155 @@
|
|||
# Mosswart Overlord (Dereth Tracker)
|
||||
|
||||
Real-time telemetry, inventory, and analytics platform for Asheron's Call.
|
||||
FastAPI backend + React frontend + PostgreSQL (TimescaleDB) + Discord integrations,
|
||||
all driven by WebSocket events from the companion [MosswartMassacre](https://github.com/SawatoMosswartsEnjoyersClub/MosswartMassacre) DECAL plugin.
|
||||
Real-time telemetry, inventory, and analytics platform for Asheron's Call —
|
||||
driven by a firehose of WebSocket events from the companion
|
||||
[MosswartMassacre](https://github.com/SawatoMosswartsEnjoyersClub/MosswartMassacre)
|
||||
DECAL plugin running on 60+ characters.
|
||||
|
||||
**The production backend is written in Go** (`go-services/`). It replaced the
|
||||
original Python/FastAPI implementation via a strangler-fig migration: the Go
|
||||
services ran in parallel against live traffic until every endpoint was proven
|
||||
byte-identical, then production was cut over. The Python implementation is
|
||||
preserved on the `python-legacy` branch.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
- [Overview](#overview)
|
||||
- [Architecture](#architecture)
|
||||
- [Features](#features)
|
||||
- [Requirements](#requirements)
|
||||
- [Installation](#installation)
|
||||
- [Configuration](#configuration)
|
||||
- [Deploying Changes](#deploying-changes)
|
||||
- [WebSocket Contract](#websocket-contract)
|
||||
- [HTTP API Reference](#http-api-reference)
|
||||
- [Frontend](#frontend)
|
||||
- [AI Assistant (Overlord Agent)](#ai-assistant-overlord-agent)
|
||||
- [Database Schema](#database-schema)
|
||||
- [Operations & Health](#operations--health)
|
||||
- [Contributing](#contributing)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Mosswart Overlord is the backend that consumes a firehose of telemetry, vitals, inventory, combat, and chat events from 60+ characters running the `MosswartMassacre` plugin. It stores selected data in TimescaleDB, runs analytics (combat stats, idle/death detection), and broadcasts live updates to connected browser clients.
|
||||
|
||||
The frontend is a React + Vite app served at `/` with a live map, draggable windows (inventory, chat, combat, radar, etc.), and a server uptime sidebar. The previous vanilla JS frontend is preserved at `/classic`.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────┐
|
||||
│ MosswartMassacre (C#) │ ← plugin per game client
|
||||
└────────────┬────────────┘
|
||||
│ WebSocket /ws/position (authenticated)
|
||||
▼
|
||||
┌────────────────────────────────────────────────────────┐
|
||||
│ dereth-tracker (FastAPI, Docker) │
|
||||
│ • main.py — WS routing, analytics, broadcasts │
|
||||
│ • idle/death detection → Discord webhook │
|
||||
│ • combat stats delta/lifetime accumulation │
|
||||
│ • vital sharing relay (cross-machine) │
|
||||
└──┬──────────────────┬────────────────────┬────────────┘
|
||||
│ │ │
|
||||
│ WS /ws/live │ HTTP │ HTTP
|
||||
▼ ▼ ▼
|
||||
┌──────────┐ ┌──────────────────┐ ┌──────────────────┐
|
||||
│ Browsers │ │ inventory-svc │ │ Discord bot │
|
||||
│ (React) │ │ (FastAPI, Docker)│ │ (rare monitor) │
|
||||
└────┬─────┘ └────────┬─────────┘ └──────────────────┘
|
||||
│ ▼
|
||||
│ ┌──────────────┐
|
||||
│ │ inventory-db │
|
||||
│ └──────────────┘
|
||||
│
|
||||
│ /api/agent/* (host-side, OUTSIDE Docker)
|
||||
▼
|
||||
┌────────────────────────────────────────┐
|
||||
│ overlord-agent (FastAPI, systemd) │ ← runs as dedicated unprivileged user
|
||||
│ • shells out to `claude -p ...` │ /var/lib/overlord-agent home,
|
||||
│ • MCP server: live-state Q&A tools │ strict settings, no /home/erik
|
||||
└────────────────────────────────────────┘
|
||||
|
||||
┌──────────────┐
|
||||
│ dereth-db │ ← TimescaleDB (telemetry, spawns, rares, portals)
|
||||
└──────────────┘
|
||||
MosswartMassacre plugin ──wss──> nginx ──> Go tracker (tracker-go) ──> dereth (TimescaleDB)
|
||||
(60+ game clients) │ │
|
||||
│ ├──HTTP──> Go inventory (inventory-go) ──> inventory_db
|
||||
Browsers ──https──────────────────> nginx │
|
||||
│ └──/ws/live──> Discord rare bot (relays rares + chat)
|
||||
└──> Grafana (/grafana/) death/idle alerts → Discord webhook
|
||||
```
|
||||
|
||||
Most services run via Docker Compose. **`overlord-agent` is host-side**
|
||||
(systemd) because it shells out to the `claude` CLI which depends on
|
||||
host-side credentials — see [AI Assistant](#ai-assistant-overlord-agent).
|
||||
|
||||
## Features
|
||||
|
||||
### Live Data
|
||||
- **Live Map** — real-time player positions, dots, trails, portals, heatmap
|
||||
- **WebSocket firehose** (`/ws/live`) — broadcasts every incoming event to browsers
|
||||
- **Per-client subscriptions** — clients can send `{"type":"subscribe","message_types":[...]}` to receive only specific event types (the Discord rare monitor bot uses this to filter the 82GB/day firehose down to just `rare` and `chat`)
|
||||
|
||||
### Inventory
|
||||
- Full inventory snapshot on login + incremental `inventory_delta` updates (add/update/remove)
|
||||
- Per-character live refresh in the browser (debounced 2s)
|
||||
- Advanced search with filters: material, set, armor level, spells, tinks, workmanship, etc.
|
||||
- **Suitbuilder** at `/suitbuilder.html` — constraint-based armor optimization across multiple mule inventories with primary/secondary set support, cantrip overlap detection, and real-time SSE streaming
|
||||
|
||||
### Combat Stats (Mag-Tools style)
|
||||
- Plugin parses combat chat into session deltas
|
||||
- Backend accumulates lifetime totals from per-session snapshots
|
||||
- Offense/defense broken out per damage element
|
||||
- Browser combat window shows monster-by-monster damage
|
||||
|
||||
### Cross-Machine Vital Sharing
|
||||
- WebSocket relay replaces UtilityBelt's localhost-only `VTankFellowHeals`
|
||||
- Plugin broadcasts its own vitals and consumes peer vitals
|
||||
- In-game `DxHud` overlay shows peer health/stamina/mana bars with direction arrows
|
||||
|
||||
### AI Assistant
|
||||
- 🤖 chat window in the dashboard backed by `claude -p` running headless on the server
|
||||
- Read-only access to live game state via 12 MCP tools (live players, inventory cross-search, combat stats, quests, suitbuilder, read-only SQL, etc.)
|
||||
- Per-browser persistent session, "New Chat" button, history rehydration on reload
|
||||
- Hardened: dedicated unprivileged Linux user, systemd lockdown, strict tool whitelist, audit log, rate limit. See [AI Assistant section](#ai-assistant-overlord-agent) for the full security stack.
|
||||
|
||||
### Discord Integration
|
||||
- **Rare Monitor Bot** — posts rares (split by common/great) to configured channels
|
||||
- **Death Alerts** — webhook to `#alerts` when a character's vitae goes from 0 → >0 (rate-limited to one per character per 5 min)
|
||||
- **Idle Alerts** — webhook after 5 minutes of continuous idle state (caught portals, stuck nav, etc.). The grace period prevents false positives on brief idle blips.
|
||||
- **Vortex Warning** — bot watches for "whirlwind of vortexes" chat and posts a warning embed
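The death-alert rule above (vitae going from 0 to above 0, at most one alert per character per 5 minutes) can be sketched as a small state machine. This is a hypothetical illustration, not the tracker's code: the type and function names are invented, and treating the first reading for a character as "no alert" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// deathAlerter remembers the last vitae reading and last alert time per character.
type deathAlerter struct {
	cooldown  time.Duration
	lastVitae map[string]float64
	lastAlert map[string]time.Time
}

func newDeathAlerter(cooldown time.Duration) *deathAlerter {
	return &deathAlerter{
		cooldown:  cooldown,
		lastVitae: make(map[string]float64),
		lastAlert: make(map[string]time.Time),
	}
}

// observe records a reading and reports whether an alert should be sent:
// the previous reading was exactly 0, the new one is above 0, and no alert
// for this character fell inside the cooldown window.
func (d *deathAlerter) observe(name string, vitae float64, now time.Time) bool {
	prev, seen := d.lastVitae[name]
	d.lastVitae[name] = vitae
	if !seen || prev != 0 || vitae <= 0 {
		return false
	}
	if last, ok := d.lastAlert[name]; ok && now.Sub(last) < d.cooldown {
		return false
	}
	d.lastAlert[name] = now
	return true
}

// simulate feeds readings for one character at a fixed interval and returns
// the alert decision for each reading.
func simulate(readings []float64, stepSeconds int) []bool {
	d := newDeathAlerter(5 * time.Minute)
	start := time.Date(2026, time.January, 1, 0, 0, 0, 0, time.UTC)
	out := make([]bool, len(readings))
	for i, v := range readings {
		now := start.Add(time.Duration(i*stepSeconds) * time.Second)
		out[i] = d.observe("Example", v, now)
	}
	return out
}

func main() {
	// Two deaths two minutes apart: only the first produces an alert.
	fmt.Println(simulate([]float64{0, 5, 0, 5}, 60))
}
```

Passing `now` in explicitly, instead of calling `time.Now()` inside `observe`, keeps the cooldown logic testable without sleeping.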
|
||||
|
||||
### Portals
|
||||
- Automatic discovery + 1-hour retention
|
||||
- Coordinate-deduplicated (rounded to 0.1 precision)
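Coordinate deduplication at 0.1 precision can be done by rounding each coordinate to tenths and using the result as a map key. The sketch below is an assumption about how such a key might be built; `portalKey`, `keyFor`, and the sample coordinates are hypothetical, and a real key may include other fields such as the portal name.

```go
package main

import (
	"fmt"
	"math"
)

// portalKey stores coordinates in integer tenths, which avoids float
// equality problems and the -0 versus 0 distinction.
type portalKey struct {
	NS, EW int64
}

// keyFor rounds each coordinate to the nearest 0.1.
func keyFor(ns, ew float64) portalKey {
	return portalKey{
		NS: int64(math.Round(ns * 10)),
		EW: int64(math.Round(ew * 10)),
	}
}

func main() {
	seen := make(map[portalKey]bool)
	// Sample sightings: the first two round to the same 0.1 cell.
	sightings := [][2]float64{{12.34, -45.67}, {12.31, -45.70}, {12.36, -45.67}}
	for _, s := range sightings {
		k := keyFor(s[0], s[1])
		if seen[k] {
			fmt.Println("duplicate", s)
			continue
		}
		seen[k] = true
		fmt.Println("new", s)
	}
}
```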
|
||||
|
||||
### Stats
|
||||
- Per-character lifetime kills, deaths, rares, taper counts
|
||||
- Grafana dashboards (2x2 iframe grid in the stats window)
|
||||
|
||||
### Health & Monitoring
|
||||
- Server uptime + latency + player count from TreeStats.net (checked every 30s)
|
||||
- Only current state is kept — no historical `server_health_checks` table (removed April 2026 as write-only bloat)
|
||||
|
||||
## Requirements
|
||||
|
||||
- Docker & Docker Compose (recommended)
|
||||
- OR: Python 3.11+, Node.js 20+, and a PostgreSQL 14+ with TimescaleDB
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
git clone git@git.snakedesert.se:SawatoMosswartsEnjoyersClub/MosswartOverlord.git
|
||||
cd MosswartOverlord
|
||||
cp .env.example .env # fill in secrets (see Configuration below)
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
### Frontend development loop
|
||||
|
||||
```bash
|
||||
cd frontend
|
||||
npm install
|
||||
npm run dev # local Vite server
|
||||
# ...edit files, hot reload...
|
||||
cd ..
|
||||
bash deploy-frontend.sh # builds + copies to static/ for production serving
|
||||
```
|
||||
|
||||
⚠️ **`npm run build` writes to `static/_build/` but the web server serves from `static/`.** You must run `deploy-frontend.sh` to copy `_build/ → static/`. Otherwise the browser keeps loading the previous bundle.
|
||||
|
||||
## Configuration
|
||||
|
||||
All secrets go in `.env`:
|
||||
|
||||
| Variable | Purpose |
|
||||
|---|---|
|
||||
| `POSTGRES_PASSWORD` | Telemetry DB password |
|
||||
| `INVENTORY_DB_PASSWORD` | Inventory DB password |
|
||||
| `SHARED_SECRET` | Plugin auth for `/ws/position` |
|
||||
| `SECRET_KEY` | Session cookie signing |
|
||||
| `DISCORD_RARE_BOT_TOKEN` | Bot token for rare monitor |
|
||||
| `DISCORD_ACLOG_WEBHOOK` | Webhook URL for death/idle alerts |
|
||||
| `GF_SECURITY_ADMIN_PASSWORD` | Grafana admin |
|
||||
| `COMMON_RARE_CHANNEL_ID` | Discord channel ID for common rares |
|
||||
| `GREAT_RARE_CHANNEL_ID` | Discord channel ID for great rares |
|
||||
| `ACLOG_CHANNEL_ID` | Discord channel ID for the rare bot's status/vortex messages |
|
||||
| `MONITOR_CHARACTER` | Which character's chat the bot monitors |
|
||||
|
||||
The Overlord Agent has its own env file at `/etc/overlord/agent.env` (root:overlord-agent 0640) so it doesn't share the tracker's secrets:
|
||||
|
||||
| Variable | Purpose |
|
||||
|---|---|
|
||||
| `SECRET_KEY` | Same value as the tracker — validates browser session cookies |
|
||||
| `AGENT_DB_DSN` | Read-only connection string `postgresql://overlord_agent_ro:<pw>@127.0.0.1:5432/dereth` |
|
||||
| `TRACKER_URL` | Loopback to the tracker container (default `http://127.0.0.1:8765`) |
|
||||
| `AGENT_RATE_MAX` | Per-user rate limit (default 60/hour) |
|
||||
| `AGENT_RATE_WINDOW_S` | Rate-limit window in seconds (default 3600) |
|
||||
| `AGENT_AUDIT_LOG` | Path to audit JSONL (default `/var/log/overlord-agent/audit.jsonl`) |
|
||||
| `CLAUDE_TIMEOUT_S` | Max seconds per `claude -p` invocation (default 240) |
|
||||
|
||||
## Deploying Changes
|
||||
|
||||
Live backend host: `overlord.snakedesert.se` (SSH user `erik`, key-based auth).
|
||||
|
||||
### Quick deploy — Python / static file changes
|
||||
|
||||
```bash
|
||||
ssh erik@overlord.snakedesert.se \
|
||||
"cd /home/erik/MosswartOverlord && git pull --ff-only origin master"
|
||||
# Python changes require a restart:
|
||||
ssh erik@overlord.snakedesert.se "docker compose restart dereth-tracker"
|
||||
# Static files (JS/CSS/HTML) are served from the bind-mounted static/ — no restart.
|
||||
```
|
||||
|
||||
⚠️ Uvicorn runs **without** `--reload` in production. Do not add it back — without the `watchfiles` package it falls back to a polling reloader that busy-loops at 100% CPU and eats a whole core.
|
||||
|
||||
### React frontend deploy
|
||||
|
||||
```bash
|
||||
cd frontend && npm run build && cd ..
|
||||
bash deploy-frontend.sh
|
||||
git add static/ && git commit -m "deploy frontend" && git push
|
||||
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && git pull"
|
||||
# No container restart needed.
|
||||
```
|
||||
|
||||
### Full rebuild — Dockerfile / pip package / version stamp changes
|
||||
|
||||
```bash
|
||||
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
|
||||
git pull --ff-only origin master && \
|
||||
export BUILD_VERSION=\"\$(date -u +%Y.%-m.%-d.%H%M)-\$(git rev-parse --short HEAD)\" && \
|
||||
docker compose build --no-cache --build-arg BUILD_VERSION=\$BUILD_VERSION dereth-tracker && \
|
||||
docker compose up -d dereth-tracker"
|
||||
```
|
||||
|
||||
`BUILD_VERSION` is displayed in the sidebar of the live frontend. Format is CalVer: `YYYY.M.D.HHMM-gitshorthash`.
|
||||
|
||||
### Overlord Agent deploy
|
||||
|
||||
Code changes to `agent/` only:
|
||||
```bash
|
||||
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
|
||||
git pull --ff-only origin master && \
|
||||
sudo systemctl restart overlord-agent"
|
||||
journalctl -u overlord-agent -f # tail logs to verify
|
||||
```
|
||||
|
||||
`agent/requirements.txt` changed (new pip deps):
|
||||
```bash
|
||||
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
|
||||
git pull --ff-only origin master && \
|
||||
agent/.venv/bin/pip install -r agent/requirements.txt && \
|
||||
sudo systemctl restart overlord-agent"
|
||||
```
|
||||
|
||||
systemd unit changed:
|
||||
```bash
|
||||
ssh erik@overlord.snakedesert.se "cd /home/erik/MosswartOverlord && \
|
||||
git pull --ff-only origin master && \
|
||||
sudo cp agent/overlord-agent.service /etc/systemd/system/ && \
|
||||
sudo systemctl daemon-reload && sudo systemctl restart overlord-agent"
|
||||
```
|
||||
|
||||
First-time install: `bash agent/install.sh` — see `agent/README.md` for the full bootstrap procedure (creating the `overlord-agent` user, copying claude auth, granting filesystem access, populating `/etc/overlord/agent.env`).
|
||||
|
||||
## WebSocket Contract
|
||||
|
||||
### `/ws/position` (plugin → backend)
|
||||
|
||||
Authenticated via `?secret=<SHARED_SECRET>` or `X-Plugin-Secret` header. Accepts JSON frames with a `type` discriminator:
|
||||
|
||||
| `type` | Purpose |
|
||||
|---|---|
|
||||
| `telemetry` | Position, kills, session metrics (every 2s per character) |
|
||||
| `vitals` | Health/stamina/mana/vitae percentages |
|
||||
| `character_stats` | Full attributes/skills/allegiance (every 10 min) |
|
||||
| `inventory` / `full_inventory` | Complete inventory dump on login |
|
||||
| `inventory_delta` | Incremental add/update/remove of a single item |
|
||||
| `equipment_cantrip_state` | Equipped spell effects |
|
||||
| `portal` | Discovered portal with coordinates |
|
||||
| `spawn` | Monster spawn observation |
|
||||
| `chat` | In-game chat line (any channel) |
|
||||
| `quest` | Quest timer / progress |
|
||||
| `rare` | Rare item find notification |
|
||||
| `nearby_objects` | On-demand radar data (nearby entities) |
|
||||
| `combat_stats` | Session combat snapshot (Mag-Tools parser output) |
|
||||
| `share_*` | Cross-machine vital/debuff sharing envelopes |
|
||||
| `dungeon_map` | Dungeon floor tile data for radar overlay |
|
||||
|
||||
See `EVENT_FORMATS.json` for exact per-type schemas.
|
||||
|
||||
### `/ws/live` (browser → backend)
|
||||
|
||||
Session-cookie authenticated (except for internal Docker network clients, which are trusted by IP). Clients can:
|
||||
|
||||
- Send `{"type":"subscribe","message_types":["rare","chat"]}` to filter which events they receive. Without subscribing, all types are forwarded (browser default).
|
||||
- Send `{"player_name":"Larsson","command":"/radar start"}` to route a command to that character's plugin client.
|
||||
- Send `{"type":"request_dungeon_map","landblock":"..."}` to pull cached dungeon tile data.
|
||||
|
||||
Backend pushes the same firehose (subject to subscription filter) to every browser client.
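The per-client subscription filter described above could be modeled as in the following sketch. Only the `subscribe` frame shape is taken from the text; the `client` type, its methods, and the choice to drop typeless events for subscribed clients are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// subscribeMsg mirrors the documented frame
// {"type":"subscribe","message_types":[...]}.
type subscribeMsg struct {
	Type         string   `json:"type"`
	MessageTypes []string `json:"message_types"`
}

// client holds one connection's filter. A nil set means "forward everything",
// which is the default for a client that never subscribes.
type client struct {
	allowed map[string]struct{}
}

// handleFrame applies a subscribe frame; other frame kinds (commands,
// dungeon map requests) are ignored in this sketch.
func (c *client) handleFrame(raw []byte) error {
	var m subscribeMsg
	if err := json.Unmarshal(raw, &m); err != nil {
		return err
	}
	if m.Type != "subscribe" {
		return nil
	}
	c.allowed = make(map[string]struct{}, len(m.MessageTypes))
	for _, t := range m.MessageTypes {
		c.allowed[t] = struct{}{}
	}
	return nil
}

// wants reports whether an event of the given type should be forwarded.
func (c *client) wants(eventType string) bool {
	if c.allowed == nil {
		return true
	}
	_, ok := c.allowed[eventType]
	return ok
}

func main() {
	c := &client{}
	fmt.Println(c.wants("telemetry")) // no subscription yet
	if err := c.handleFrame([]byte(`{"type":"subscribe","message_types":["rare","chat"]}`)); err != nil {
		panic(err)
	}
	fmt.Println(c.wants("rare"), c.wants("telemetry"))
}
```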
|
||||
|
||||
## HTTP API Reference
|
||||
|
||||
See `EVENT_FORMATS.json` for event schemas. Major HTTP endpoints:
|
||||
|
||||
- `GET /live` — active players seen in the last 30s
|
||||
- `GET /history?from=…&to=…` — historical telemetry snapshots
|
||||
- `GET /trails` — recent player trails for the map
|
||||
- `GET /spawns/heatmap?hours=N` — aggregated spawn density
|
||||
- `GET /portals` — discovered portals within retention window
|
||||
- `GET /inventory/{character}` — current inventory (proxied to inventory-service)
|
||||
- `GET /character-stats/{character}` — full character attributes/skills
|
||||
- `GET /combat-stats/{character}` — session + lifetime combat stats
|
||||
- `GET /vital-sharing/peers` — currently-registered vital sharing peers
|
||||
- `GET /api-version` — build version stamp
|
||||
- `GET /server-health` — current Coldeve server status + player count
|
||||
|
||||
## Frontend
|
||||
|
||||
### React v2 (primary, at `/`)
|
||||
- Map-first layout with draggable/resizable windows
|
||||
- Code-split bundles: one chunk per window type, lazy-loaded on open
|
||||
- Window types: Chat, Stats, Inventory, Character, Radar, CombatStats, CombatPicker, Issues, VitalSharing, QuestStatus, PlayerDashboard
|
||||
- Per-character inventory version counter — an open inventory window refreshes 2s after its own character's last `inventory_delta`, ignoring unrelated traffic
|
||||
- Direct DOM pan/zoom on the map (no React state per frame)
|
||||
- Service worker caches a small whitelist of static assets
|
||||
- Version badge in the sidebar confirms which build is loaded
|
||||
|
||||
### Classic v1 (preserved at `/classic`)
|
||||
The original vanilla JS frontend with element-pooling optimization is kept for fallback and reference.
|
||||
|
||||
## AI Assistant (Overlord Agent)
|
||||
|
||||
A draggable chat window in the dashboard (🤖 Assistant button). Powered by `claude -p` running headless on the server, with read-only access to live game state via an MCP server.
|
||||
|
||||
### Architecture
|
||||
- **Host-side service** (`agent/`, systemd unit `overlord-agent`) runs OUTSIDE Docker because the `claude` CLI binary lives on the host (`/home/erik/.local/bin/claude`) and depends on host-side authentication credentials.
|
||||
- **Dedicated UNIX user** (`overlord-agent`, system account, `/var/lib/overlord-agent` home, no shell) — kernel-level isolation from the operator's `erik` account. Cannot read `/home/erik/.claude`, `~/.ssh`, `.bash_history`, `.env`, etc.
|
||||
- **MCP stdio server** (`agent/mcp_overlord.py`) exposes 12 tools that wrap the tracker's HTTP endpoints + read-only DB queries. Claude only sees these tools; no `Bash`, `Read`, `Write`, etc.
|
||||
- **Frontend** (`AgentWindow.tsx`) — per-browser session UUID in localStorage, "New Chat" button, on-mount rehydration from `/agent/sessions/{id}/history`.
|
||||
|
||||
### MCP tools available to the assistant
|
||||
`get_live_players`, `get_player_state`, `get_combat_stats`, `get_equipment_cantrips`, `get_inventory`, `get_inventory_search`, `search_items` (cross-character), `get_recent_rares`, `get_quest_status`, `get_server_health`, `query_telemetry_db` (read-only SQL via sqlglot parser + GRANT-SELECT-only PG role), `suitbuilder_search`. Plus `WebFetch(domain:acpedia.org)` for AC info lookups.
|
||||
|
||||
### Security stack (defense-in-depth)
|
||||
1. **Cookie auth** on `/agent/ask` (same session cookie the tracker issues)
|
||||
2. **Per-user rate limit** (60 req/h default) and **concurrency cap** (1 in-flight)
|
||||
3. **JSONL audit log** at `/var/log/overlord-agent/audit.jsonl` (every prompt + result)
|
||||
4. **CLI flags** — `--allowed-tools` (just our 12 MCP tools), `--disallowed-tools` (Bash, Write, Read, Edit, Agent, ToolSearch, Monitor, scheduling, Gmail/Drive/Calendar, etc.), `--permission-mode dontAsk`
|
||||
5. **`/var/lib/overlord-agent/.claude/settings.json`** — strict deny rules (server-side only, NOT in repo)
|
||||
6. **System-prompt scope rules** in `CLAUDE.md` — instruct the model not to probe, not to suggest workarounds
|
||||
7. **SQL parser** (`sqlglot`) rejects any non-SELECT statement on `query_telemetry_db`
|
||||
8. **Read-only PG role** `overlord_agent_ro` (GRANT SELECT only) — even a parser bypass can't mutate
|
||||
9. **systemd hardening** — `ProtectSystem=strict`, `ProtectHome=read-only`, `InaccessiblePaths=/etc/shadow,/root,~/.ssh,…`, `NoNewPrivileges=true`, `CapabilityBoundingSet=` (empty), `PrivateTmp=true`, `PrivateDevices=true`, `RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6`, `SystemCallFilter=@system-service ~@privileged ~@reboot ~@mount`, `MemoryMax=512M`, `TasksMax=128`
|
||||
10. **Secrets out of /home** — `/etc/overlord/agent.env` (root:overlord-agent 0640) for SECRET_KEY + AGENT_DB_DSN
|
||||
|
||||
### Files
|
||||
|
||||
| Path | What |
|
||||
|------|------|
|
||||
| `agent/service.py` | FastAPI app: `/agent/health`, `/agent/sessions/new`, `/agent/ask`, `/agent/sessions/{id}/history` |
|
||||
| `agent/auth.py` | Session cookie validation (mirrors `main.py:1013-1019`) |
|
||||
| `agent/claude_wrapper.py` | `asyncio.create_subprocess_exec("claude", "-p", …)` with allowed/disallowed-tools |
|
||||
| `agent/tools.py` | Pure tool implementations |
|
||||
| `agent/mcp_overlord.py` | MCP stdio server registering tools |
|
||||
| `agent/sql/0001_overlord_agent_ro.sql` | Read-only PG role |
|
||||
| `agent/overlord-agent.service` | systemd unit (the hardening directives) |
|
||||
| `agent/install.sh` | venv + systemd setup |
|
||||
| `agent/README.md` | Operator's deeper reference |
|
||||
| `.mcp.json` (repo root) | Project-level MCP config Claude Code auto-loads |
|
||||
| `CLAUDE.md` "Overlord Assistant Mode" section | System-prompt briefing |
|
||||
|
||||
### Routing
|
||||
nginx forwards `/api/agent/*` to `127.0.0.1:8767` (the host-side service) with a 300s read/send timeout (suitbuilder runs can be slow). Other `/api/*` continues to the dereth-tracker container at `127.0.0.1:8765`.
|
||||
|
||||
### Cost / quota
|
||||
Subscription auth (no API key); per-call cost is informational only. Each `/agent/ask` invocation = one `claude -p` subprocess with shared session cache. Reactive only — no background polling, no scheduled tasks.
|
||||
|
||||
## Database Schema
|
||||
|
||||
### Telemetry DB (`dereth`, TimescaleDB)
|
||||
|
||||
| Table | Type | Retention | Purpose |
|---|---|---|---|
| `telemetry_events` | hypertable | 30 days | Position/stats snapshots |
| `spawn_events` | hypertable | 7 days | Monster spawn observations (heatmap source) |
| `rare_events` | regular | forever | Rare find history |
| `portals` | regular | 1 hour | Discovered portals, dedup by rounded coords |
| `char_stats` | regular | forever | Per-character lifetime kill total |
| `rare_stats` | regular | forever | Per-character lifetime rare total |
| `rare_stats_sessions` | regular | forever | Per-session rare count |
| `combat_stats` | regular | forever | Lifetime combat accumulator |
| `combat_stats_sessions` | regular | forever | Per-session combat snapshots |
| `character_stats` | regular | forever | Latest full stats JSON per character |
| `server_status` | regular | forever | Current Coldeve server state (single row) |

### Inventory DB (`inventory_db`, PostgreSQL)

Normalized schema: `items`, `item_combat_stats`, `item_requirements`, `item_enhancements`, `item_ratings`, `item_spells`, `item_raw_data`.

`items.container_id` stores the in-game ID of the container holding the item (0 = character body). The frontend groups items into packs by this ID.

## Operations & Health

### PostgreSQL tuning

`dereth-db` runs with explicit memory overrides in `docker-compose.yml`:
- `shared_buffers=8GB` (was 96GB via auto-tune on a 32GB host — caused thrashing)
- `effective_cache_size=16GB`
- `work_mem=16MB`, `maintenance_work_mem=1GB`
- `max_wal_size=4GB`

### Retention policies

- `telemetry_events`: 30-day drop, daily
- `spawn_events`: 7-day drop, daily
- `portals`: 1-hour cleanup (background task in `main.py`)
- `server_health_checks`: **removed** — was write-only, 850K rows of nothing

### Log levels

Both `dereth-tracker` and `inventory-service` run at `LOG_LEVEL=INFO`. Do not set to `DEBUG` in production — it dumps full inventory_delta payloads for every item update (hundreds of KB/sec).

### Host (Proxmox VM)

- 6 vCPU, 32 GiB RAM (of which ~30 GiB is normally free under current load)
- Live host: `overlord.snakedesert.se`
- Reverse proxy: Nginx on the host terminates TLS and strips the `/api/` prefix before forwarding to port 8765

### Debug commands

```bash
docker ps
docker logs mosswartoverlord-dereth-tracker-1 --tail 100
docker logs mosswartoverlord-inventory-service-1 --tail 100
docker logs mosswartoverlord-discord-rare-monitor-1 --tail 100
docker exec dereth-db psql -U postgres -d dereth
```

## Contributing

Contributions welcome. Please:
- Keep cross-repo protocol changes additive (new optional fields > renames/removes)
- Update both this README and `CLAUDE.md` when workflows change
- Test end-to-end: plugin → backend → browser for any new event type

For detailed architecture notes and ongoing investigations, see `CLAUDE.md` and `docs/plans/`.

| Component | Path | Runs as | Notes |
|---|---|---|---|
| **Tracker** (ingest + website + read API + WS) | `go-services/tracker-go/` | Docker `dereth-tracker-go`, 127.0.0.1:8770 | serves the React frontend, login/admin, the plugin `/ws/position`, browser `/ws/live`, and the full read API; writes the `dereth` DB |
| **Inventory** (search + suitbuilder + ingestion) | `go-services/inventory-go/` | Docker `inventory-go`, 127.0.0.1:8772 | normalized item search, the suitbuilder solver (SSE), inventory ingestion; writes `inventory_db` |
| Telemetry DB | TimescaleDB | Docker `dereth-db`, 5432 | hypertables `telemetry_events`, `spawn_events` |
| Inventory DB | postgres:14 | Docker `inventory-db`, 5433 | 7-table normalized item schema |
| React frontend | `frontend/` → `static/` | served by `tracker-go` | unchanged by the migration — same paths, same API |
| Classic v1 / legacy pages | `static/classic/`, `static/*.html` | served by `tracker-go` | `/classic`, `/suitbuilder.html`, `/inventory.html` |
| Grafana | compose `dereth-grafana` | 127.0.0.1:3000 | anonymous Viewer auth, proxied at `/grafana/` |
| Discord rare bot | `discord-rare-monitor/` (Python) | Docker, reads Go `/ws/live` | posts rares + relays allegiance chat |
| Overlord Agent (assistant) | `agent/` | host-side systemd `overlord-agent`, 127.0.0.1:8767 | shells out to `claude -p`; outside Docker by design |

**Stack:** Go 1.25 (stdlib `net/http` with 1.22 method+path routing, `pgx/v5`,
`coder/websocket`, `bwmarrin/discordgo`, `golang.org/x/crypto/bcrypt`), distroless
multi-stage images. React 19 + Vite + TypeScript. PostgreSQL/TimescaleDB. nginx
reverse proxy (host-side). Unlike the old single-worker Python service, the Go
tracker uses `GOMAXPROCS` = all available cores, so traffic bursts parallelize
instead of bottlenecking on one core.

---

## Build & run

Everything builds and runs in Docker — **no host Go toolchain needed** (the
multi-stage images compile from source). The production stack is the base compose
(databases, Grafana, Discord bot) plus two override files for the Go services and
the cutover wiring.

```bash
# --- build the Go service images ---
export BUILD_VERSION="$(date -u +%Y.%-m.%-d.%H%M)-$(git rev-parse --short HEAD)"
docker compose -f docker-compose.yml -f go-services/docker-compose.go.yml \
  build dereth-tracker-go inventory-go

# --- production: Go services in write mode, serving the site + ingest ---
docker compose -f docker-compose.yml \
  -f go-services/docker-compose.go.yml \
  -f go-services/docker-compose.cutover.yml \
  up -d --no-deps dereth-tracker-go inventory-go
```

- `docker-compose.go.yml` defines the Go services (plus the isolated shadow DBs used during the parallel run).
- `docker-compose.cutover.yml` flips the Go services to **write mode** against the production DBs (`READ_ONLY=false`, `SKIP_SCHEMA_INIT=true` so they run no DDL and trust the existing schema) and points the Discord bot at the Go `/ws/live`. Drop this file to return the Go services to read-only parallel mode.
- `BUILD_VERSION` is shown in the frontend sidebar (CalVer: `YYYY.M.D.HHMM-gitshorthash`).
- Required env (server `.env`, **never committed**): `SHARED_SECRET`, `SECRET_KEY`, `POSTGRES_PASSWORD`, `INVENTORY_DB_PASSWORD`, `DISCORD_ACLOG_WEBHOOK`, `DISCORD_RARE_BOT_TOKEN`, the Discord channel IDs, and Grafana admin. See `.env.example`.

### Frontend (unchanged by the migration)

The React app and the legacy static pages call the same absolute paths
(`/api/...`, `/inv/...`, `/live`, …) — the Go tracker answers them, so the
frontend ships as-is.

```bash
cd frontend && npm run dev          # local dev, port 5173, /api → :8770
bash deploy-frontend.sh             # complete build + copy into static/ (runs npm run build itself)
```

The tracker serves `static/` directly (bind-mounted), so static/JS/CSS changes
need no restart. ⚠️ `npm run build` writes to `static/_build/`; only
`deploy-frontend.sh` copies it into the served `static/`.

### nginx

The live config is host-side at `/etc/nginx/sites-enabled/overlord` (source copy
in `nginx/overlord.conf`); the `tracker_go` upstream is in
`/etc/nginx/conf.d/tracker_go.conf` (`server 127.0.0.1:8770;`). Production routes
`/`, `/api/`, `/websocket/` to the Go tracker. Every location that proxies to the
tracker **must** set `X-Forwarded-For` — it drives the internal-trust auth rule.

### Overlord Agent

Unchanged by the migration — it's a host-side Python systemd service. To deploy a
code change, run `git pull && sudo systemctl restart overlord-agent`. Its env lives
separately at `/etc/overlord/agent.env`. See `agent/` and `CLAUDE.md`.

---

## WebSocket contract

- **`/ws/position`** — plugin → backend. Telemetry, vitals, inventory, portal, rare, combat, quest, chat, share_*, … Authenticated by the `X-Plugin-Secret` header against `SHARED_SECRET` (constant-time; fails closed when unset). The tracker forwards inventory to `inventory-go`, accumulates kill/combat stats, and re-broadcasts to browsers.
- **`/ws/live`** — browser ↔ backend. Session-cookie (or internal-trust) authenticated. Accepts `subscribe`, `request_dungeon_map`, and `{player_name, command}` envelopes routed to the matching plugin socket. **Telemetry is broadcast typeless** so the browser ignores it and takes player data from the 5 s `/live` poll (matching the original design — broadcasting it typed flaps the per-player counters).
- **Internal-trust rule:** a request skips cookie auth only when its source is private/loopback **and** carries no `X-Forwarded-For`. nginx sets XFF on all internet traffic, so only host-side / compose-network callers qualify.

### Payload note

Payloads are snake_case JSON; keep field names and shapes stable across plugin +
backend. The plugin sends several numeric telemetry fields as **strings**
(`kills_per_hour`, `deaths`, `total_deaths`, `prismatic_taper_count`) — the backend
coerces them (`coerceNum` in `tracker-go/reads.go`).

## Auth & users

Session cookies are signed with `SECRET_KEY` via an itsdangerous-compatible
`URLSafeTimedSerializer` (HMAC-SHA1, 30-day expiry) — cookies interoperate with
the legacy Python service. Login at `/login` (bcrypt against the `users` table),
admin user CRUD at `/api-admin/users`, current user at `/me`.

## Databases

Two separate Postgres databases, both schema-from-code:

- **`dereth`** (TimescaleDB, `dereth-db`): hypertables `telemetry_events` + `spawn_events`, plus `char_stats`, `combat_stats(_sessions)`, `rare_*`, `portals`, `character_stats`, `users`. Persisted event types: telemetry, spawn, rare, portal, character_stats, combat_stats. Everything else (vitals, quest, cantrips, nearby_objects, dungeon_map, share_*) is memory-only.
- **`inventory_db`** (postgres:14, `inventory-db`): 7 normalized tables (`items` + combat/requirements/enhancements/ratings/spells/raw_data).

In cutover mode the Go services reuse these production databases directly; the
shadow DBs in `docker-compose.go.yml` exist only for isolated parallel-run
validation. **Backups:** `pg_dump -Fc` of both DBs; TimescaleDB restore needs
`timescaledb_pre_restore()` / `post_restore()` around `pg_restore`.

## Route conventions

- nginx strips `/api/` before proxying, so backend routes do **not** start with `/api/`.
- Hyphenated routes (`/api-version`, `/api-admin/...`) deliberately bypass the strip (they fall through nginx's `location /`).
- The static SPA is the catch-all (`GET /`), registered after the API routes, with `index.html` fallback for client-side routing.
- `/inv/*` reverse-proxies to the inventory service; `/api/agent/*` is proxied by nginx (not the tracker) to the host-side agent.

## Operational notes

- Discord: the rare bot posts rares + relays allegiance chat; **death/idle alerts come from the tracker itself** via `DISCORD_ACLOG_WEBHOOK`.
- Issue board persists to the flat file `static/openissues.json` (web-served, mounted read-write).
- Logs: `docker logs dereth-tracker-go`, `docker logs inventory-go`. Read-only psql: `docker exec dereth-db psql -U postgres -d dereth`, `docker exec inventory-db psql -U inventory_user -d inventory_db`.
- **This repo is PUBLIC** on git.snakedesert.se — never commit secrets. `.env` is gitignored; `.env.example` is the template.

## Branches

- **`master`** — the Go production backend (this).
- **`python-legacy`** — the original Python/FastAPI implementation, preserved for reference and rollback.

See [`CLAUDE.md`](CLAUDE.md) for contributor/agent guidance and deeper internals.