Commit graph

303 commits

Author SHA1 Message Date
Erik
1294ec4418 feat(go-services): inventory-go search — object_class_name (exact, gem context)
Loads the ObjectClass enum map and adds object_class_name via translate_object_class,
including the context-aware Gem(11) classification (crystal/mana stone/gem/aetheria
by item name, using the original name before the material prefix). The rare
aetheria-by-IntValues path is documented as not reproduced (needs original_json).

Validated vs Python: 0 mismatches over 600 rows (3 queries incl. a 'crystal' text
search that exercises the gem context) for object_class_name, name, material_name,
item_set_name, value.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 12:11:47 +02:00
Erik
50360f72c4 feat(go-services): inventory-go search — name/material/set enrichment (exact)
enrichRows now applies the material-name prefix to name (material is already a
translated string in the DB), sets material_name + original_name, and resolves
item_set_name via the AttributeSetInfo enum (fallback "Set {id}").

Validated vs Python position-by-position: 0 mismatches across 60 armor + 60
jewelry rows for name, material_name, item_set_name, original_name, value,
object_class. Sample names match exactly (e.g. "Gold Alduressa Coat").

Remaining enrichment slices: object_class_name (gem context), spells/spell_names
(needs the spells enum map), slot_name (sophisticated), weapon damage/speed/mana,
rating gear-total fallbacks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 12:02:17 +02:00
Erik
c04cfaf2c6 feat(go-services): inventory-go search — computed_slot_name + slot/weapon/set filters
Adds the full computed_slot_name CASE (EquippableSlots decode, jewelry by
wielded-location, weapons, cloak) and the remaining SQL filters: weapon_type
(skill-id EXISTS), slot_names (per-slot OR clauses), item_set/item_sets
(translate_equipment_set_id, bug-for-bug).

Validated vs Python (total_count EXACT): weapon_type heavy/bow/caster (473/138/
474), slot_names ring/neck/cloak (1286/1428/220), item_set 13 (526). The
computed_slot_name VALUES match exactly (slot distribution identical: Head 721,
Hands 458, Feet 403, Chest 376, ...).

Two documented edge-case discrepancies, both Python main-vs-count CTE
inconsistencies (Python's count query uses a SIMPLIFIED slot CASE where armor ->
'Armor', so its own total_count disagrees with its item list): slot_names with
armor slot names, and sort_by=slot_name empty-string ordering. Our consistent
single-CASE implementation is arguably more correct; reconcile to Python's count
CTE later if strict parity on those is required.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 11:56:10 +02:00
Erik
7e7842128e feat(go-services): inventory-go /search/items — query+filters+count (validated)
Ports the search CTE (items_with_slots: combat/req/enh joins + the rating
extractions from item_raw_data.int_values JSONB via GREATEST/COALESCE, coverage
mask, computed_spell_names), the SQL filters, sort mapping, pagination, and the
DISTINCT count query. Returns each row's direct DB columns + computed booleans
(is_equipped/bonded/attuned/rare, condition_percent).

Validated vs the Python service on the production DB: total_count EXACT across
13 filter combinations (armor/jewelry/min_armor/min_damage/text/material/
min_value/is_rare/rating/equipped/character/workmanship), and 50-row alignment
with 0 direct-column mismatches (same SQL sort order, same rows).

Deferred to later slices: deep per-row enrichment (extract_item_properties:
material_name/spells/slot_name/object_class_name/...), and the enum-dependent
filters (has_spell/spell_contains/legendary_cantrips, slot_names, item_set,
weapon_type, underwear).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 11:45:34 +02:00
Erik
253250a01d feat(go-services): inventory-go Phase A — read-side scaffold + simple endpoints
First slice of the inventory-service port, running in parallel READ-ONLY against
the production inventory_db (never written):
- main.go/store.go: pgx pool (forced read-only), enum-DB loader extracting
  AttributeSetInfo for set-name resolution, /health, /sets/list, /characters/list.
- Dockerfile + compose service inventory-go (127.0.0.1:8772, enum JSON mounted).

Validated vs the Python service on the same DB: /characters/list 167 chars exact
counts; /sets/list 76 sets EXACT match (ids, names, counts).

Remaining (large): /search/items (40+ filters + enrich_db_item), inventory
fetch, item-processing ingestion (extract_item_properties), and the suitbuilder
solver.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 11:33:55 +02:00
Erik
5b2db439a3 feat(go-services): tracker share_* handlers (complete ingest) + shadow tuning
- share.go: cross-machine vital sharing (share_subscribe/unsubscribe/share_*),
  faithful port of the peer-state snapshot + plugin fan-out + /vital-sharing/peers.
  The last ingest handler — the Go tracker now handles every plugin event type.
- shadow consumer: drop the outbound keepalive ping (the firehose is never idle)
  and tighten the read-deadline watchdog to 12s for faster reconnect after the
  upstream's periodic eviction (full-firehose browser clients get evicted ~every
  90s; the watchdog recovers it, ~90% duty cycle). Production-bound /ws/position
  is unaffected (plugins connect to us; no eviction).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 11:27:25 +02:00
Erik
27757636e4 feat(go-services): tracker WS servers (/ws/position + /ws/live) + robust shadow
Completes the Go tracker as a cutover-ready drop-in:
- wslive.go: browser broadcast hub with per-client subscribe filters (nil=all),
  request_dungeon_map replies, and command routing; auth = internal-trust or
  session cookie. The ingestor broadcasts every handled event to it.
- wsposition.go: plugin ingest server with X-Plugin-Secret/SHARED_SECRET auth
  (constant-time, fails closed, legacy fallback), register -> plugin_conns, and
  dispatch into the shared Ingestor. plugin registry for backend->plugin commands.
- main.go: statusRecorder.Unwrap() so coder/websocket can hijack through the
  logging middleware (WS handshakes failed without it); /ws/ bypasses HTTP auth.

Shadow consumer robustness (the harness was being evicted under the full
firehose): decouple socket read from processing — the read loop only copies raw
frames to a queue; a worker unmarshals + dispatches. JSON parsing in the read
loop was slowing it enough that Python's broadcast send errored and evicted us
(Read then blocked forever). Added a 25s read-deadline watchdog to self-heal.

Validated live: shadow /live online = 73 = production; telemetry sustained ~12/s,
0 drops, no eviction; and the shadow's /ws/live re-broadcast stream is IDENTICAL
to production's (TOTAL 2150=2150, every event type exact).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 11:15:05 +02:00
Erik
7350b00341 feat(go-services): Phase 2 — combat_stats accumulator (cross-language exact)
Ports main.py's _combat_session_delta / _combat_merge_into_lifetime (incl. the
documented "offense/defense use latest, additively" quirk) and the combat_stats
handler (session delta -> DB-backed lifetime merge -> delete-then-insert of
combat_stats + combat_stats_sessions). Read handlers gain the live combat
overlay (union live + DB), like Python.

Validation:
- combat.go `combat-merge` CLI folds snapshots through the accumulator; diffed
  against the Python functions on identical input -> byte-IDENTICAL.
- combat_test.go golden test runs in the build (go test now part of the tracker
  Dockerfile).
- Live: 40 combat lifetime rows + 40 session snapshots + rare_events flowing in
  dereth_go via the shadow consumer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 10:42:15 +02:00
Erik
a5d69ba88d feat(go-services): Phase 2 ingest — shared Ingestor + shadow consumer
Implements the plugin event handlers (the /ws/position write logic) as a shared
Ingestor, validated against real traffic by replaying Python's /ws/live firehose
into an isolated dereth_go DB (no production write, no plugin stolen).

- ingest.go: faithful ports of telemetry (kill-delta -> char_stats, server
  received_at stamp), rare (rare_stats/rare_stats_sessions/rare_events), portal
  (coord upsert), character_stats (stats_data JSONB subset + upsert), spawn, and
  the memory-only handlers (vitals/quest/equipment_cantrip/nearby/dungeon). In
  -memory live state + read-side overlay accessors.
- shadow.go: coder/websocket consumer of /ws/live -> Ingestor.dispatch (telemetry
  matched by shape since its broadcast has no type field).
- main.go/store.go: ingest mode (READ_ONLY=false + SHADOW_INGEST_WS) wires the
  ingestor; read handlers (/character-stats, /equipment-cantrip, /quest-status)
  now consult the live overlay first, like Python.
- compose: shadow instance ingests ws://dereth-tracker:8765/ws/live.

Validated live: dereth_go has 73 distinct telemetry chars; shadow /live online
set == production (73=73); character_stats 5/5 exact byte-match (0 mismatch);
char_stats kill-deltas + portals accumulating. compare/compare_ingest.py.

Deferred to next pass: combat_stats (delta/merge), share_*, the /ws/position +
/ws/live servers (for cutover).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 10:31:15 +02:00
Erik
6a839e69bc feat(go-services): Phase 2 foundation — isolated shadow DB + schema
Stands up the shadow-ingest substrate without touching production:
- schema.go: faithful replica of db_async.init_db_async (idempotent DDL),
  run only when an instance OWNS its DB (READ_ONLY=false). Fixes for a fresh DB:
  spawn_events has no sole-id PK (so it can be a hypertable), telemetry_events
  compression is enabled before its policy, and the portal unique index uses
  ROUND(..,1) to match main.py's ON CONFLICT. 35/35 statements OK.
- store.go: read-only transaction enforcement is now conditional (on for
  production read parity, off for ingest).
- main.go: READ_ONLY + SHADOW_INGEST_WS config; schema init on boot when owning
  the DB.
- compose override: a SEPARATE TimescaleDB `dereth-go-db` (isolated volume,
  127.0.0.1:5434) and a `dereth-tracker-go-shadow` instance (image reused via
  dereth-tracker-go:local) that owns it. Production DB never written.

Verified: dereth_go has all 13 tables; telemetry_events + spawn_events are
hypertables; the read-side instance still serves production read-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 10:18:30 +02:00
Erik
b6d2871cf0 chore(go-services): add .gitattributes (force LF) to stop CRLF churn
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 10:07:20 +02:00
Erik
8d4c6ff68f feat(go-services): discord-go — Go port of discord-rare-monitor
Consumes the Python tracker's /ws/live firehose (subscribed to rare+chat),
classifies rares common/great, posts embeds + relays allegiance chat.

- classify.go: the 74-name common-rares set, extracted verbatim from the Python
  COMMON_RARES_PATTERN (not hand-transcribed). go test runs at build time; a
  server-side dump-rares vs the Python regex confirms the sets are IDENTICAL.
- poster.go: a `poster` interface with a real discordgo impl (REST sends by
  channel id; gold/blue embeds, location/time fields, icon attachment) and a
  dry-run log impl.
- ws.go: coder/websocket client to /ws/live with subscribe, ping keepalive,
  exponential-backoff reconnect; rare/chat dispatch incl. vortex-warning + the
  MONITOR_CHARACTER filter.
- SAFE BY DEFAULT: dry-run unless a token AND DRY_RUN=0 are set, so it can never
  double-post to production. Deployed via the compose override
  (discord-rare-monitor-go), running dry-run against the same live firehose.

Validated on the server: connects, subscribes, relays a real chat in dry-run;
classifier parity 74/74 vs the Python regex.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 10:06:59 +02:00
Erik
426fe025d3 chore(go-services): ready-to-apply nginx /go/ snippet (user must sudo)
The agent cannot sudo (password required), so nginx deploy is a user step.
go-services/nginx/go-location.conf holds the `location /go/` block + the
`upstream tracker_go` line with apply instructions. Not required for the
parallel run (the Go service is parity-verified on loopback); this is for
browser-reachable /go/ access. Live overlord.conf has drifted from the repo
copy — reconcile by hand, don't cp-overwrite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:51:25 +02:00
Erik
bf15d4a2f7 feat(go-services): tracker-go — auth gate (itsdangerous + internal-trust)
Replicates main.py's AuthMiddleware so /go/ can be exposed safely:
- internal-trust: private source IP AND no X-Forwarded-For => skip auth
  (loopback/compose callers; nginx adds XFF to all internet traffic).
- session cookie: byte-compatible itsdangerous URLSafeTimedSerializer verify
  (HMAC-SHA1, django-concat key derivation sha1("itsdangerous"+"signer"+key),
  Unix-epoch timestamp, urlsafe-b64 no pad, optional zlib payload), keyed on the
  same SECRET_KEY. 30-day max-age. Public allowlist (/login,/logout,login assets,
  /icons/,/health); 302->/login for html, 401 JSON otherwise.

Validated on the server: internal-trust loopback 200; external no-cookie 401;
html 302; valid cookie 200; tampered 401; /health public 200; and the SAME
Python-issued cookie authenticates BOTH services (cross-compat proof).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:48:47 +02:00
Erik
c4e8190656 feat(go-services): tracker-go — complete the Phase 1 read API
Adds the rest of the read-side endpoints to the Go tracker, all parity-checked
against the live Python service:

- DB reads: /stats/{c}, /portals, /spawns/heatmap, /server-health,
  /character-stats/{c} (stats_data JSONB merged to top level),
  /combat-stats[/{c}], /inventories, /inventory/{c}/search.
- 5-minute totals cache + /total-rares, /total-kills.
- Ingest-only state returned as Python's empty/default shapes (/quest-status,
  /vital-sharing/peers, /equipment-cantrip-state/{c}); /issues (flat file),
  /me (401 until cookie verification lands).
- Streaming reverse proxy to inventory-service (/inventory/{c},
  /inventory-characters, /search/*, /sets/list, /inv/{path...} incl. the SSE
  suitbuilder stream).
- compare/compare_endpoints.py: structural parity for all read endpoints +
  exact-match check for /character-stats and /combat-stats on OFFLINE chars
  (online chars legitimately differ — Python serves a richer live overlay that
  Phase-1 Go lacks until ingest).

Verified live: 14/14 endpoints structural-match, 8/8 rich offline chars
exact-match on /character-stats.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:38:10 +02:00
Erik
1af47520c0 feat(go-services): tracker-go Phase 0/1 — /live + /trails read parity
Parallel Go reimplementation of the dereth-tracker read side, deployed
loopback-only (:8770) and reading the dereth TimescaleDB read-only. The live
Python stack is untouched (added via a compose override, not by editing the
tracked docker-compose.yml).

- Phase 0 scaffold: stdlib net/http server (Go 1.22+ method+path routing),
  /health + /api-version, multi-stage distroless Docker build, and
  go-services/docker-compose.go.yml override (loopback :8770).
- Phase 1: pgx v5 pool forced into read-only transactions, a 5s /live + /trails
  cache loop using the exact main.py:837 SQL, and Python-isoformat timestamps
  so output matches FastAPI's jsonable_encoder.
- compare/compare_live.py: parity harness vs the live Python service. Uses the
  server-stamped received_at to prove same-row full-field equality and to make
  the online-set diff boundary-aware.

Verified on live traffic (73 players): identical online set + 23-key schema,
identity/type parity for all, every same-row pair matches on every field, and
diff-row pairs differ only by the ~6s two-cache refresh skew.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 09:24:22 +02:00
Erik
b8fd449d62 docs(go-prompt): inventory-service + discord bot are also Go rewrite targets
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-24 08:51:34 +02:00
Erik
47607d75fb docs: add fresh-session prompt for the parallel Go backend rewrite
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-24 08:31:47 +02:00
Erik
6a0bb9fe80 feat(sidebar): restore the rickroll title-click easter egg
Holiday's over — revert the Sma Grodorna frog-hop title gag back to the
original /rick.mp4 fullscreen rickroll + shake/spin it replaced.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-24 08:15:21 +02:00
Erik
cc686da532 feat(midsummer): retire the theme out of season (holiday over)
Flip the seasonal master switch (SEASON_ACTIVE=false in useMidsummer) so the
Sma Grodorna theme is fully dormant — no rain/frogs/maypole/banner/palette,
regardless of any stored toggle preference — and remove the 🐸 toggle from the
sidebar. All theme code is kept; to bring it back next Midsummer, flip
SEASON_ACTIVE to true and re-add <FrogToggle /> in SidebarWindowButtons.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-24 08:12:00 +02:00
Erik
0565a54ae5 fix(live): window 'online' on server receive-time, not client clock
The player-count flapping was client clock skew: telemetry is stamped with the
game machine's DateTime.UtcNow (WebSocket.cs), and machines' clocks drift up to
~90s apart (proven: per-char offsets span -31s..+59s with steady 6s cadence; a
wrong server clock would shift all equally, so the SPREAD proves clients differ
from each other; a +59s future timestamp rules out lag). /live windowed on that
client timestamp, so characters whose clock sat near the 30s boundary blinked
in and out.

Fix: stamp each telemetry row with the server's receive-time (received_at) and
window the /live 'online' query on COALESCE(received_at, timestamp) instead of
the client timestamp. A coarse timestamp bound (10 min) is kept only for
TimescaleDB chunk pruning. Column added idempotently in init_db_async; COALESCE
falls back to the client timestamp for pre-migration rows. Verified on the live
DB: query valid, 8ms, equivalent pre-population. ~free CPU (one datetime.now()
per ~14 inserts/sec).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-23 23:34:35 +02:00
Erik
645feef9aa perf(inventory): cap concurrent forwards so flush bursts can't starve telemetry
Root cause of the player-count flapping: the plugin's debounced inventory
flush, combined with a fleet-wide relog wave (auto-update) phase-aligning the
60s flush timers, produced a synchronized burst of inventory forwards every
cycle. The burst flooded the single event loop + httpx pool (errors in
_do_handle_inventory_delta even though inventory-service was idle), periodically
starving telemetry ingest (cliff 116→5 rows/10s) so characters aged out of the
30s window and the count flapped.

- Global asyncio.Semaphore(8) around inventory forwarding: a burst can never
  monopolize the loop; telemetry always gets through.
- Tighten the shared httpx client (max_connections=10, keepalive=5, 5s timeout)
  so a stale/slow connection can't hold a slot.

Pairs with the plugin-side flush-timer jitter (2–5 min, re-rolled per tick) that
de-synchronizes the fleet at the source.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-23 22:18:34 +02:00
Erik
349c15d944 perf(broadcast): serialize once with stdlib json, drop jsonable_encoder from hot path
Every browser broadcast ran jsonable_encoder (slow recursive encode) and then
re-serialized per client via send_json — so a payload to N browsers was
encoded N+1 times, on the same single event-loop core that the telemetry/
inventory firehose already saturates.

Now serialize ONCE with json.dumps + a datetime-aware default (_json_default
mirrors jsonable_encoder for the types that actually appear: datetime, Enum,
Decimal, set, bytes) and send the prebuilt string to every client via
send_text. Verified the wire output parses identically to the old path.
Pure backend change — no plugin, no frontend, no schema change; stdlib only
so it deploys via restart with no image rebuild / dependency churn.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-23 21:40:52 +02:00
Erik
d86bc48862 feat(midsummer): rain of flowers/frogs/Swedish flags, dots become frogs, drop jingle
Per request: remove the WebAudio jingle (+ its 🔊 toggle and sound state);
replace the one-shot confetti with a continuous rain of 🌼🌸🐸🇸🇪🌿 over the
screen (MidsummerRain, gated by the theme, reduced-motion aware, leak-free);
and replace player-dot markers with frogs themselves (override the inline
dot color/border) instead of a flower-crown on top. Still toggled by the
🐸 Midsommar switch. Includes rebuilt static bundle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-19 09:47:39 +02:00
Erik
7141a38c5c build(midsummer): deploy Sma grodorna theme to static bundle
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-19 09:34:26 +02:00
Erik
1f86e7cc86 polish(midsummer): guard frog-hop against rapid re-click stacking
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-19 09:33:51 +02:00
Erik
3cd2165c15 feat(midsummer): WebAudio Sma grodorna jingle, plays once on first gesture 2026-06-19 09:30:00 +02:00
Erik
e896ef1f21 feat(midsummer): frog-hop easter egg replaces the rickroll 2026-06-19 09:29:07 +02:00
Erik
c4dd1b7ae7 feat(midsummer): glad midsommar banner + one-shot confetti 2026-06-19 09:28:34 +02:00
Erik
e7b0f11bb1 feat(midsummer): flower-crown dots, frog on selected 2026-06-19 09:27:40 +02:00
Erik
da0cc79def feat(midsummer): dancing maypole pinned to map centre 2026-06-19 09:27:26 +02:00
Erik
2fb6fd2f3e feat(midsummer): sidebar frog toggle + jingle toggle (sound stubbed) 2026-06-19 09:26:45 +02:00
Erik
580fd6fbc5 feat(midsummer): pond-green palette overlay for sidebar and map 2026-06-19 09:26:12 +02:00
Erik
568992d0f9 feat(midsummer): theme state provider + data-midsummer attribute 2026-06-19 09:25:54 +02:00
Erik
e803c35af9 docs(plan): Sma Grodorna midsummer theme implementation plan (+ spec: WebAudio jingle)
9-task plan with complete code for the frog/maypole theme: scoped CSS
overlay, useMidsummer provider, dancing maypole, crown/frog dots, banner +
confetti, frog-hop easter egg, WebAudio jingle. Spec updated to synthesize
the jingle (no mp3 asset / licensing).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-19 09:22:54 +02:00
Erik
b3753d1ab0 docs(spec): Sma Grodorna midsummer theme design
Full-takeover frog/maypole midsummer theme for the React frontend:
scoped [data-midsummer] CSS overlay, useMidsummer hook (localStorage,
default on), dancing maypole inside the map pan/zoom group, frog +
flower-crown dots, Glad midsommar banner + confetti, frog-hop easter egg
replacing the rickroll, play-once unmuted jingle. Manual 🐸 toggle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-19 09:15:31 +02:00
Erik
52bf9342df feat: SHARED_SECRET_LEGACY migration escape hatch for plugin secret rollout
Accepts one legacy secret alongside the real one so existing clients keep
registering while game machines migrate to websocket_secret.txt. Remove
SHARED_SECRET_LEGACY from .env after the rollout.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 20:20:19 +02:00
Erik
15ae870117 docs: CLAUDE.md reflects env-based SHARED_SECRET and XFF internal-trust rule
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 17:08:39 +02:00
Erik
a28b61511c security: enforce real plugin secret, fix proxy auth bypass, loopback DB ports, nightly backups
- SHARED_SECRET now read from env and fail-closed: unset/placeholder refuses
  ALL plugin connections (constant-time compare). The old hardcoded
  'your_shared_secret' in this public repo was no auth at all. Dockerfile
  default removed; generate_data.py reads the env var.
- SECRET_KEY fails closed at startup (main.py and agent/auth.py) instead of
  falling back to a publicly-known signing key; agent systemd unit now
  requires /etc/overlord/agent.env (no '-' prefix).
- AuthMiddleware + /ws/live: replace the 172.x source-IP trust (which every
  nginx-proxied internet request satisfied via docker-proxy — full session
  bypass and unauthenticated in-game command injection) with
  private-source AND no X-Forwarded-For, i.e. only genuinely internal
  callers (overlord-agent on the host, compose-network services). Invariant
  documented in nginx/overlord.conf: every tracker-bound location must set
  X-Forwarded-For.
- /character-stats/test endpoints gated behind admin (they upsert real rows).
- docker-compose: bind 5432/5433 to 127.0.0.1 (both DBs were internet-
  reachable; active brute-force observed in dereth-db logs).
- discord-rare-monitor: drop dead SHARED_SECRET constant.
- scripts/backup-databases.sh + docs/backups.md: nightly pg_dump of both DBs
  (telemetry/spawn hypertable data excluded), 10MB canary, umask 077,
  TimescaleDB restore procedure.
- Remove stray mangled-path css file from repo root.

Adversarially reviewed pre-deploy (3-lens workflow): ship verdict; deploy-
sequencing blockers addressed (secret staged before enforcement, exec bit
set, cron uses bash).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 17:02:47 +02:00
Erik
c6a1af0c39 docs: rewrite CLAUDE.md from audit — drop stale 2025 fix journal
The old file was ~half September-2025 changelog with claims now wrong:
portal race condition (fixed via ON CONFLICT upsert), hypertable warning
(telemetry_events IS a hypertable on live), pool 5-20 (actually 5-100),
--no-cache rebuild for code changes (bind mounts + restart suffice),
/ws/live unauthenticated (cookie-authenticated since), static-HTML
frontend description (React since). Rewritten around current reality:
component map, WS endpoints + auth caveats, two-database schema-as-code
situation (alembic empty, manual ALTERs), route conventions, React
deploy flow, operational notes. Overlord Assistant Mode section
preserved verbatim (consumed at runtime by the agent service).

AGENTS.md: remove nonexistent GET /history, fix /ws/live auth claim.
Also delete stray mangled-path file from repo root.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 16:36:01 +02:00
Erik
a5b80fd9cd security(nginx): remove dead Grafana service-account token from committed config
The glsa_ token on the /grafana/ location was committed to a public repo.
Verified dead: Grafana's service-account and api_key tables are empty (the
data dir is ephemeral container storage, so the SA was wiped on a past
recreate) and an arbitrary invalid bearer gets identical 200 responses —
panel embeds are actually served by anonymous Viewer auth
(GF_AUTH_ANONYMOUS_ENABLED=true). The header was a no-op; removing it
changes no behavior and removes the credential from the config.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 16:25:20 +02:00
Erik
87e4f2ff62 fix(dashboard): table width:auto so Character column sizes to content only
With width:100%, the table stretched to fill the container — and the
Character column (the only one with stretchable text) absorbed all the
extra space, looking much wider than the longest name needed.

width:auto lets each column size to its content. Table now fits its
data; container still scrolls horizontally if content ever exceeds the
viewport.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-23 19:37:49 +02:00
Erik
5f43ddce93 feat(dashboard): click-to-highlight rows + character column auto-sizes
Two small UX improvements to the Player Dashboard table (works in both
the new-tab fullscreen page and the deprecated in-app window since they
share PlayerDashboardContent):

1. Row highlight: click anywhere on a row to highlight it (blue tint +
   thin outline). Click again to unhighlight. Single selection — useful
   for tracking one character down a long sorted list.

2. Character column no longer truncates: removed maxWidth/overflow/
   textOverflow on the name cell. Column now sizes to the longest
   character name (still no wrapping; container scrolls horizontally
   if names are extreme).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-23 19:35:09 +02:00
Erik
5bda2b64f4 feat(dashboard): open Player Dashboard in a new tab
The 👥 Dashboard button used to open the player table as a draggable
in-app window, which competed for screen space with the map. It now
opens in a separate browser tab as a fullscreen page so users can put
the dashboard on a second monitor.

How:
- App.tsx branches on ?view=dashboard → renders PlayerDashboardFullPage
  (new file in components/) instead of the default MapLayout.
- SidebarWindowButtons.tsx: 👥 Dashboard onClick now does
  window.open('/?view=dashboard', '_blank', 'noopener'). Label shows
  '↗' so users know it's an external open.
- PlayerDashboardWindow.tsx refactored: extracted the sortable table
  body into a reusable PlayerDashboardContent component. The old window
  shell stays registered in WindowRenderer for backward compat — just
  no longer reachable from the default sidebar.
- map-layout.css: new .ml-dashboard-page rules for fullscreen layout.

Each tab gets its own useLiveData + WebSocket connection (server
already handles multiple browser clients). The new tab inherits the
session cookie from the original tab — no re-login.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-23 19:31:26 +02:00
Erik
3cf6437617 fix(auth): populate request.state.user inside loopback-bypass branch
The Docker-bridge / loopback bypass in AuthMiddleware was short-circuiting
the whole auth flow without ever decoding the session cookie. Result: /me
and other endpoints reading request.state.user got 401 for real logged-in
browsers (because nginx → docker-proxy makes them look like 172.x).

Symptom: dashboard admin UI invisible even for admin users — useCurrentUser
saw 401 from /me and treated everyone as anonymous.

Fix: in the bypass branch, still try to decode any session cookie present
and populate request.state.user. The bypass still permits anonymous
internal calls (overlord-agent's MCP tools), but real authenticated browsers
now get their user correctly resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 20:17:22 +02:00
Erik
1c1c43d28b feat(dashboard): logout button + admin user-management window
Logout: new sidebar link 'Log out (username)' that POSTs /api/logout
(clears session cookie) and navigates to /login. Visible to everyone.
Replaces 'no logout functionality' state where users could only get
out by deleting cookies manually.

Admin window: new 'Admin · Users' window (only shown when current
user.is_admin) lists all users in a table with:
  - Add user (username + password + admin checkbox)
  - Reset password inline per row
  - Toggle admin per row
  - Delete user per row (blocked for self)
Wraps the existing /api-admin/users CRUD endpoints in main.py.

Plumbing: useCurrentUser hook fetches /me on mount; apiPatch+apiDelete
helpers added to api/client.ts; new endpoint wrappers exported from
api/endpoints.ts; AdminUsersWindow.tsx registered in WindowRenderer
under id prefix 'adminusers'; CSS for admin table/form/buttons and
the muted-red logout link.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 20:10:10 +02:00
Erik
88e9e88f46 docs(agent): brief Claude on AC rare tier classification
The user kept asking 'show me great rares' and Claude kept showing
Crystals/Pearls/Jewels because the rare_events table doesn't store the
tier — and Claude didn't know the distinction. Now CLAUDE.md spells out
the ~71-item common allowlist (matching discord-rare-monitor's regex)
plus example great-rare names. Includes a sample SQL query Claude can
adapt for tier filtering.
2026-04-25 23:07:57 +02:00
Erik
1196746dbe fix(agent): SQL parser robust against sqlglot version drift
The query_telemetry_db tool was crashing with AttributeError because
exp.AlterTable doesn't exist in this sqlglot version (renamed to Alter).
Made the deny-class list build defensively via getattr and dropped any
classes that the installed sqlglot doesn't expose.

Also broadened the deny list (Alter, AlterColumn, AlterDatabase, Truncate,
Grant, Revoke, Copy) and made the toplevel allowlist tolerant of missing
classes too. The walk() return shape is also normalized in case sqlglot
versions yield (node, parent, key) tuples vs. bare nodes.

Belt-and-suspenders is fine — the GRANT-SELECT-only PG role is the real
write barrier; the parser is just a faster/friendlier reject path.
2026-04-25 23:07:00 +02:00
Erik
6de4bfe03e docs: README updated for Overlord Agent — host-side service, security stack, deploy flow
Adds an AI Assistant section covering:
- Architecture (host-side vs Docker, dedicated overlord-agent user)
- The 12 MCP tools available to the model
- 10-layer security stack (cookie auth, rate limit, audit log, allowed/disallowed tools, settings.json deny, system prompt, SQL parser, RO PG role, systemd hardening, dedicated UID)
- Full file inventory under agent/
- Routing via nginx /api/agent/*
- Cost / quota notes (subscription auth, reactive only)

Plus features-section blurb, agent env vars table (/etc/overlord/agent.env),
and deploy-flow recipes (code-only restart, requirements update, unit
file change, first-time install).
2026-04-25 22:50:54 +02:00
Erik
0633865598 fix(agent): block Agent + Gmail/Drive/Calendar tools, brief model not to probe
Two complementary changes after observing the model probe boundaries
(it tried mcp__claude_ai_Gmail__search_threads, then tried to delegate
to a subagent via the Agent tool, then suggested the user edit
settings.local.json to add Gmail tools):

1. claude_wrapper.py adds to --disallowed-tools:
   - Agent  (subagent spawning — should never delegate)
   - WebFetch (already; settings.json re-allows acpedia.org only)
   - Every Gmail/Calendar/Drive connector tool name we know about

2. CLAUDE.md adds a 'Non-negotiable scope rules' section:
   - Be a read-only game-state QA service, nothing else
   - Don't attempt tools outside your role
   - Don't explain how to bypass restrictions
   - Don't suggest settings.json edits
   - Don't enumerate hidden tools when asked

Soft (system-prompt) + hard (CLI flag) defenses combined.
2026-04-25 22:45:39 +02:00