Commit graph

259 commits

Author SHA1 Message Date
Erik
3cf6437617 fix(auth): populate request.state.user inside loopback-bypass branch
The Docker-bridge / loopback bypass in AuthMiddleware was short-circuiting
the whole auth flow without ever decoding the session cookie. Result: /me
and other endpoints reading request.state.user got 401 for real logged-in
browsers (because nginx → docker-proxy makes them look like 172.x).

Symptom: dashboard admin UI invisible even for admin users — useCurrentUser
saw 401 from /me and treated everyone as anonymous.

Fix: in the bypass branch, still try to decode any session cookie present
and populate request.state.user. The bypass still permits anonymous
internal calls (overlord-agent's MCP tools), but real authenticated browsers
now get their user correctly resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 20:17:22 +02:00
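A minimal sketch of the behaviour described in the commit above, under stated assumptions: `is_internal`, `resolve_request`, the `172.16.0.0/12` range and the injected `decode_cookie` callable are hypothetical stand-ins, not the project's actual AuthMiddleware code.

```python
import ipaddress
from typing import Callable, Optional, Tuple

# Assumption: Docker bridge peers fall inside the private 172.16.0.0/12 block.
DOCKER_NET = ipaddress.ip_network("172.16.0.0/12")


def is_internal(client_ip: str) -> bool:
    """True for loopback or (assumed) Docker-bridge peers."""
    addr = ipaddress.ip_address(client_ip)
    return addr.is_loopback or (addr.version == 4 and addr in DOCKER_NET)


def resolve_request(
    client_ip: str,
    cookie: Optional[str],
    decode_cookie: Callable[[str], Optional[str]],
) -> Tuple[bool, Optional[str]]:
    """Return (allowed, user). decode_cookie is a hypothetical session decoder."""
    user = decode_cookie(cookie) if cookie else None
    if is_internal(client_ip):
        # Bypass branch: always allowed, but the user is still resolved
        # when a valid cookie is present.
        return True, user
    # External peers need a valid session.
    return user is not None, user
```

With this shape, an anonymous internal caller is admitted with `user` set to `None`, while a browser that arrives through the proxy with a valid cookie keeps its identity.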
Erik
1c1c43d28b feat(dashboard): logout button + admin user-management window
Logout: new sidebar link 'Log out (username)' that POSTs /api/logout
(clears session cookie) and navigates to /login. Visible to everyone.
Previously there was no logout functionality, so users could only log
out by deleting cookies manually.

Admin window: new 'Admin · Users' window (only shown when current
user.is_admin) lists all users in a table with:
  - Add user (username + password + admin checkbox)
  - Reset password inline per row
  - Toggle admin per row
  - Delete user per row (blocked for self)
Wraps the existing /api-admin/users CRUD endpoints in main.py.

Plumbing: useCurrentUser hook fetches /me on mount; apiPatch+apiDelete
helpers added to api/client.ts; new endpoint wrappers exported from
api/endpoints.ts; AdminUsersWindow.tsx registered in WindowRenderer
under id prefix 'adminusers'; CSS for admin table/form/buttons and
the muted-red logout link.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 20:10:10 +02:00
Erik
88e9e88f46 docs(agent): brief Claude on AC rare tier classification
The user kept asking 'show me great rares' and Claude kept showing
Crystals/Pearls/Jewels because the rare_events table doesn't store the
tier — and Claude didn't know the distinction. Now CLAUDE.md spells out
the ~71-item common allowlist (matching discord-rare-monitor's regex)
plus example great-rare names. Includes a sample SQL query Claude can
adapt for tier filtering.
2026-04-25 23:07:57 +02:00
Erik
1196746dbe fix(agent): SQL parser robust against sqlglot version drift
The query_telemetry_db tool was crashing with AttributeError because
exp.AlterTable doesn't exist in this sqlglot version (renamed to Alter).
Made the deny-class list build defensively via getattr and dropped any
classes that the installed sqlglot doesn't expose.

Also broadened the deny list (Alter, AlterColumn, AlterDatabase, Truncate,
Grant, Revoke, Copy) and made the top-level allowlist tolerant of missing
classes too. The walk() return shape is also normalized in case sqlglot
versions yield (node, parent, key) tuples vs. bare nodes.

Belt-and-suspenders is fine — the GRANT-SELECT-only PG role is the real
write barrier; the parser is just a faster/friendlier reject path.
2026-04-25 23:07:00 +02:00
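The defensive lookup described in the commit above could look roughly like this. It is a sketch only: `fake_exp` is a stand-in namespace playing the role of `sqlglot.expressions`, and `build_class_tuple`, `node_of` and `is_denied` are hypothetical helper names, not the tool's real code.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Tuple


def build_class_tuple(module: Any, names: Iterable[str]) -> Tuple[type, ...]:
    """Collect only the classes that the given module actually exposes."""
    found = (getattr(module, name, None) for name in names)
    return tuple(cls for cls in found if isinstance(cls, type))


def node_of(item: Any) -> Any:
    """Normalize a walk() item: tuples carry the node first, else it is bare."""
    return item[0] if isinstance(item, tuple) else item


def is_denied(walked: Iterable[Any], deny: Tuple[type, ...]) -> bool:
    """True if any walked node is an instance of a denied class."""
    return any(isinstance(node_of(item), deny) for item in walked)


# Stand-in for a library version that has Alter but no AlterTable.
class Alter: ...
class Truncate: ...
class Select: ...

fake_exp = SimpleNamespace(Alter=Alter, Truncate=Truncate, Select=Select)
DENY = build_class_tuple(fake_exp, ["AlterTable", "Alter", "Truncate", "Grant"])
```

Missing names simply drop out of `DENY` instead of raising `AttributeError` at import time.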
Erik
6de4bfe03e docs: README updated for Overlord Agent — host-side service, security stack, deploy flow
Adds an AI Assistant section covering:
- Architecture (host-side vs Docker, dedicated overlord-agent user)
- The 12 MCP tools available to the model
- 10-layer security stack (cookie auth, rate limit, audit log, allowed/disallowed tools, settings.json deny, system prompt, SQL parser, RO PG role, systemd hardening, dedicated UID)
- Full file inventory under agent/
- Routing via nginx /api/agent/*
- Cost / quota notes (subscription auth, reactive only)

Plus features-section blurb, agent env vars table (/etc/overlord/agent.env),
and deploy-flow recipes (code-only restart, requirements update, unit
file change, first-time install).
2026-04-25 22:50:54 +02:00
Erik
0633865598 fix(agent): block Agent + Gmail/Drive/Calendar tools, brief model not to probe
Two complementary changes after observing the model probe boundaries
(it tried mcp__claude_ai_Gmail__search_threads, then tried to delegate
to a subagent via the Agent tool, then suggested the user edit
settings.local.json to add Gmail tools):

1. claude_wrapper.py adds to --disallowed-tools:
   - Agent  (subagent spawning — should never delegate)
   - WebFetch (already listed; settings.json re-allows acpedia.org only)
   - Every Gmail/Calendar/Drive connector tool name we know about

2. CLAUDE.md adds a 'Non-negotiable scope rules' section:
   - Be a read-only game-state QA service, nothing else
   - Don't attempt tools outside your role
   - Don't explain how to bypass restrictions
   - Don't suggest settings.json edits
   - Don't enumerate hidden tools when asked

Soft (system-prompt) + hard (CLI flag) defenses combined.
2026-04-25 22:45:39 +02:00
Erik
e780f249d1 fix(agent): keep strict permissions server-side, not in repo
The previous commit put .claude/settings.json IN THE REPO, which would
have applied its strict deny rules to ANY Claude Code invocation from
this cwd — including the human user's interactive dev sessions on their
own machine. That's wrong; the production agent's lockdown should not
constrain the developer.

Remove the committed file and gitignore .claude/ entirely. The repo is
permission-neutral now.

Strict permissions for the production agent come from two server-only
sources:
  1. CLI flags in agent/claude_wrapper.py (--allowed-tools +
     --disallowed-tools, passed by the systemd-spawned subprocess only)
  2. /var/lib/overlord-agent/.claude/settings.json (the agent's own HOME
     — separate from any user's .claude/)

Also bumps claude_wrapper.py with the explicit --disallowed-tools list
of meta-tools (ToolSearch, Monitor, TodoWrite, TaskOutput, Skill, cron
tools, etc.) that the --allowed-tools whitelist does not block on its
own. Verified empirically: with only --allowed-tools, ToolSearch was
still callable; --disallowed-tools is required.
2026-04-25 22:26:02 +02:00
Erik
f894399165 feat(agent): isolate from erik — dedicated overlord-agent user
The agent service was running as User=erik, which meant:
- Sessions polluted erik's ~/.claude/projects/
- erik's .claude/settings.local.json (months of accumulated dev permissions
  for docker/git/dotnet/etc.) was loaded by the production agent, defeating
  the --allowed-tools whitelist
- Subscription rate quota mingled between human-erik's interactive Claude
  Code use and the production assistant
- Theoretical access to /home/erik/.ssh, .bash_history, .gitconfig

Now:
- User=overlord-agent (system account, no shell, /var/lib/overlord-agent home)
- HOME=/var/lib/overlord-agent — claude state fully isolated from erik
- /home/erik/.claude permissions tightened to 0700 (was 0755)
- group=overlord-agent on the repo + /etc/overlord/agent.env (read-only)

Project settings:
- New strict committed .claude/settings.json: deny Bash/Read/Write/Edit/
  Glob/Grep/NotebookEdit/WebSearch; allow only WebFetch(domain:acpedia.org)
- .claude/settings.local.json now gitignored (was leaking dev permissions
  to the server through the deploy)
2026-04-25 21:50:57 +02:00
Erik
49ae4369e0 fix(agent): relax SystemCallFilter — Node needs @cpu-emulation etc.
The extra ~@cpu-emulation ~@obsolete ~@swap ~@raw-io negations on top of
@system-service killed Claude Code (Node) with SIGSYS during startup.

Keep just the truly dangerous groups blocked: ~@privileged ~@reboot
~@mount. The base @system-service preset is an allowlist, so groups it
does not include (@debug, for example) stay blocked without an explicit
negation.
2026-04-25 21:31:14 +02:00
Erik
5cf052cedf fix(agent): drop MemoryDenyWriteExecute — breaks Node.js V8 JIT
Claude Code is a Node app. V8 JIT requires W^X transitions via mprotect
with PROT_EXEC on JIT'd code pages. MemoryDenyWriteExecute kills the
process with SIGTRAP/abort during startup (~10ms in).

Without JIT we'd have to use --jitless mode, which destroys performance.
The other systemd hardening (ProtectSystem, ProtectHome,
InaccessiblePaths, NoNewPrivileges, capability drop, syscall filter,
PrivateTmp, etc.) still gives strong filesystem and privilege isolation.
The remaining shellcode-injection risk is theoretical — there is no
Bash/Write/Edit tool exposed for an attacker to chain into.

Also: replaced the deprecated MemoryLimit= setting with MemoryMax=.
2026-04-25 21:29:16 +02:00
Erik
9d4c724b7f feat(agent): security hardening — systemd lockdown, rate limit, audit log
systemd unit now applies defense-in-depth:
- ProtectSystem=strict + ProtectHome=read-only (rest of FS sealed)
- ReadWritePaths only for ~/.claude (session JSONLs) and venv + audit log
- InaccessiblePaths blocks /etc/shadow, /etc/ssh, /root, ~/.ssh, shell history
- NoNewPrivileges + dropped capabilities (no setuid escalation, no caps)
- PrivateTmp, PrivateDevices, ProtectKernel*, MemoryDenyWriteExecute
- SystemCallFilter @system-service ~@privileged ~@debug ~@mount etc.
- RestrictAddressFamilies blocks raw/packet sockets

Application layer:
- Per-user rate limit 60/hour (configurable via AGENT_RATE_MAX)
- Per-user concurrency cap of 1 in-flight (no parallel claude burns)
- JSONL audit log of every /agent/ask to /var/log/overlord-agent/audit.jsonl
  Logs username, message preview, result preview, timing, errors.

Plus secrets migration: EnvironmentFile now prefers /etc/overlord/agent.env
(root:erik 0640) over /home/erik/MosswartOverlord/.env, so even the
read-only /home doesn't expose them. Falls back to old path during
transition.
2026-04-25 21:25:40 +02:00
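The application-layer limits in the commit above (60 requests per hour and one in-flight request per user) can be sketched as below. `UserLimiter` and its method names are hypothetical. The commit says the limit is configurable via AGENT_RATE_MAX, but how that variable is read is not shown, so it appears here only as a constructor argument.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Set


class UserLimiter:
    """Sliding-window rate limit plus a one-in-flight cap per user."""

    def __init__(self, max_per_window: int = 60, window_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_per_window = max_per_window
        self.window_s = window_s
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)
        self._in_flight: Set[str] = set()

    def try_start(self, user: str) -> bool:
        """Admit the request, or return False if the user is over a limit."""
        now = self.clock()
        hits = self._hits[user]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if user in self._in_flight or len(hits) >= self.max_per_window:
            return False
        hits.append(now)
        self._in_flight.add(user)
        return True

    def finish(self, user: str) -> None:
        """Mark the user's in-flight request as done."""
        self._in_flight.discard(user)
```

A caller would wrap each request in `try_start` and `finish` (in a `try`/`finally`), rejecting the request when `try_start` returns `False`.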
Erik
4ae18536be feat(agent): cross-char search_items tool + bump timeouts
Adds an MCP tool wrapping the inventory-service /search/items endpoint
with include_all_characters=true, so questions like 'find me a bracelet
with Legendary Acid Ward on any unequipped char' resolve in ONE tool call
instead of looping get_inventory over 60+ chars (which timed out at 120s).

- agent/tools.py: search_items_global wrapper
- agent/mcp_overlord.py: register new tool with detailed schema doc
- agent/claude_wrapper.py: include in --allowed-tools whitelist;
  bump timeout 120s -> 240s
- nginx/overlord.conf: bump /api/agent/ proxy timeout 180s -> 300s
- CLAUDE.md: brief Claude to USE search_items for cross-char searches
2026-04-25 21:13:26 +02:00
Erik
d3943e894c fix(agent): SECURITY — replace bypassPermissions with dontAsk
bypassPermissions ignores --allowed-tools entirely (per
permission-modes.md docs). With it, the model could call Bash, Write,
Edit, Read, etc. — confirmed by writing /tmp/owned.sh in a test.

dontAsk is the correct production headless mode: auto-DENIES anything
outside the --allowed-tools whitelist instead of prompting. Without
this, our entire MCP whitelist was effectively useless.
2026-04-25 21:05:53 +02:00
Erik
6d5819d297 fix(agent): use --resume on existing sessions, --session-id only for new
Claude Code rejects --session-id on a session that already exists on disk
('Session ID ... is already in use'). The first message of a conversation
must use --session-id to create; every message after must use --resume.

Detect by checking ~/.claude/projects/<encoded-cwd>/<uuid>.jsonl. Plus a
belt-and-suspenders retry: if --session-id surprisingly fails with the
'already in use' string, automatically retry with --resume.

This was the bug that caused chat windows to fail on the second message.
2026-04-25 20:51:46 +02:00
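A rough sketch of the flag selection described in the commit above. The function names are hypothetical, and the caller is assumed to supply the already-resolved project directory under ~/.claude/projects/, since the cwd encoding is not spelled out here.

```python
from pathlib import Path
from typing import List


def session_args(project_dir: Path, session_id: str) -> List[str]:
    """Pick --resume if the session's JSONL already exists, else --session-id."""
    transcript = project_dir / f"{session_id}.jsonl"
    flag = "--resume" if transcript.exists() else "--session-id"
    return [flag, session_id]


def should_retry_with_resume(stderr_text: str) -> bool:
    """Fallback check for the 'already in use' rejection quoted in the commit."""
    return "already in use" in stderr_text
```

The first message of a conversation gets `--session-id`; once the transcript file exists, every later message gets `--resume`.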
Erik
0745aefdb9 fix(auth): trust internal Docker/loopback connections in AuthMiddleware
Same pattern we already use for /ws/live (host-side Discord bot bypass).
Lets the new overlord-agent service call any tracker HTTP endpoint
without forging a session cookie. Safe because port 8765 is bound to
127.0.0.1 in docker-compose.yml — only the host or other compose-network
containers can reach it.
2026-04-25 20:47:47 +02:00
Erik
a3353e572d fix(agent): whitelist MCP tools + bypass permissions for unattended service
2026-04-25 20:46:42 +02:00
Erik
64523c4e97 fix(agent): point .mcp.json at venv python so MCP deps resolve
2026-04-25 20:45:52 +02:00
Erik
79cf88d3f7 feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess
Adds an in-dashboard AI assistant that answers questions about live game
state. Designed reactively (no background loops) — every user message in
the chat window or via /api/agent/ask runs one `claude -p` invocation.

Architecture:
- New host-side FastAPI service (agent/) on 127.0.0.1:8767, OUTSIDE the
  dereth-tracker Docker container because `claude` and ~/.claude
  credentials live on the host.
- nginx routes /api/agent/* to the host service.
- The same browser session cookie the tracker issues authenticates
  agent requests (shared SECRET_KEY).
- The agent shells out to `claude -p --session-id <uuid>` with
  cwd=/home/erik/MosswartOverlord. Sessions persist as JSONL on disk
  via Claude Code's built-in machinery.
- An MCP stdio server (agent/mcp_overlord.py) exposes tools to Claude:
  get_live_players, get_recent_rares, query_telemetry_db (read-only,
  parsed by sqlglot to reject DML/DDL), get_player_state, get_inventory,
  get_inventory_search, get_combat_stats, get_equipment_cantrips,
  get_quest_status, get_server_health, suitbuilder_search.
- Read-only PG role (overlord_agent_ro) is the second line of defense
  on the SQL tool — even a parser bypass can't mutate.

Frontend:
- AgentWindow.tsx — draggable chat window with localStorage-pinned
  session UUID, "New Chat" button, on-mount rehydration from
  /agent/sessions/{id}/history (parses Claude Code's JSONL).
- Wired into WindowRenderer + Sidebar (🤖 Assistant button).

Operational:
- systemd unit (overlord-agent.service) + install.sh.
- agent/README.md documents env vars, deploy flow, smoke tests.
- nginx/overlord.conf gets a new /api/agent/ location with 180s timeout.
- CLAUDE.md gets an "Overlord Assistant Mode" section briefing the
  agent on which tools to use and how to behave.

NOT YET DEPLOYED — server still needs:
1. Apply agent/sql/0001_overlord_agent_ro.sql + ALTER ROLE password
2. Add AGENT_DB_DSN to /home/erik/MosswartOverlord/.env
3. bash agent/install.sh (creates venv, installs unit, starts service)
4. sudo cp /home/erik/MosswartOverlord/nginx/overlord.conf to nginx + reload

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 20:43:59 +02:00
Erik
aeddaf9925 fix(ws): per-character lock for inventory_delta to prevent FK race
The previous commit moved inventory_delta handling to fire-and-forget
asyncio tasks. That removed the WS-loop blockage but introduced a race:
when the same character generated multiple deltas in quick succession
(mana burn, ID refresh, loot bursts), the tasks ran concurrently and
inventory-service's DELETE-then-INSERT path raced on the items table:

  asyncpg.exceptions.ForeignKeyViolationError:
  update or delete on table 'items' violates foreign key constraint
  'item_combat_stats_item_id_fkey'

The 500 errors caused inventory_delta updates to be dropped silently
(likely the source of the 'items in wrong container' bug the user
reported earlier — every delta returning 500 means the DB never updates).

Fix: per-character asyncio.Lock — deltas for the same character serialize,
deltas for different characters still run in parallel. Restores correctness
without losing the non-blocking-WS-loop benefit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:47:59 +02:00
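The per-character lock from the commit above can be illustrated with a small sketch. `process_delta` and the injected `apply` coroutine are hypothetical names; the real handler posts to inventory-service instead.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, DefaultDict

# One lock per character name, created lazily on first use.
_locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)


async def process_delta(
    character: str,
    delta: Any,
    apply: Callable[[str, Any], Awaitable[None]],
) -> None:
    """Serialize deltas per character; different characters run in parallel."""
    async with _locks[character]:
        await apply(character, delta)
```

Two deltas for the same character wait on the same lock, so the DELETE-then-INSERT steps cannot interleave, while a delta for another character takes a different lock and proceeds immediately.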
Erik
e512c1c296 fix(ws): non-blocking inventory_delta + better disconnect handling
Two issues causing plugin WS disconnects on heavy-loot characters:

1. inventory_delta processing was awaiting an httpx POST to inventory-
   service inline within the WS receive loop. Each delta also created a
   fresh httpx.AsyncClient (no connection pool reuse). When inventory-
   service was slow under load, the receive loop blocked, keepalives
   stopped flowing, and the connection eventually dropped (especially
   for characters spamming deltas: Elliot was reconnecting ~every 4 min).

   Fix: process each delta as an asyncio.create_task() — the WS receive
   loop returns immediately to read the next message. Use a shared
   httpx.AsyncClient with connection pooling.

2. websocket.receive_text() raises RuntimeError ("Need to call accept
   first") instead of WebSocketDisconnect in some race conditions when
   the connection closes mid-await. The receive loop only caught
   WebSocketDisconnect, so RuntimeError propagated up as an exception
   traceback in logs.

   Fix: catch RuntimeError and log as a clean disconnect.

Also: log close code/reason on WebSocketDisconnect so we can tell apart
clean closes (1000/1001) from network drops (1006) etc.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:36:01 +02:00
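The two fixes in the commit above can be sketched together as a receive loop that never awaits its handlers inline. This is an illustration under assumptions: `Disconnected` stands in for the framework's WebSocketDisconnect, `receive_loop` and its parameters are hypothetical, and draining pending tasks before returning is a choice of this sketch rather than something the commit states.

```python
import asyncio
from typing import Any, Awaitable, Callable, Set


class Disconnected(Exception):
    """Stand-in for the framework's WebSocketDisconnect."""


async def receive_loop(
    receive_text: Callable[[], Awaitable[str]],
    handle: Callable[[str], Awaitable[Any]],
) -> str:
    """Read messages without awaiting their handlers; return the close reason."""
    pending: Set[asyncio.Task] = set()
    try:
        while True:
            message = await receive_text()
            # Fire-and-forget: keep a reference so the task is not collected.
            task = asyncio.create_task(handle(message))
            pending.add(task)
            task.add_done_callback(pending.discard)
    except Disconnected:
        reason = "disconnect"
    except RuntimeError:
        # Raised in some close races; treat it as a clean disconnect.
        reason = "runtime-error"
    # Let in-flight handlers finish before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return reason
```

Because the loop goes straight back to `receive_text()`, a slow handler no longer delays reading the next frame.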
Erik
f111e5063b ops: add nginx site config to repo as source-of-truth
The live host nginx config (/etc/nginx/sites-enabled/overlord) was not
tracked in git, leading to drift. This commit checks in a source-of-truth
copy under nginx/overlord.conf with a deploy procedure documented at the
top of the file.

Includes the proxy_read_timeout/proxy_send_timeout 1d settings for both
WebSocket location blocks (/websocket/ and /). Without these, nginx's
default 60s timeout drops idle plugin connections in a reconnect loop —
the symptom users saw was "WebSocket error … State: Aborted" every
~60s on idle characters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:32:33 +02:00
Erik
3c634adbdc docs: rewrite README to reflect current architecture
Full rewrite covering:
- React v2 frontend at /, classic v1 preserved at /classic
- WebSocket message-type subscription mechanism (bot filter fix)
- Death + idle alerts via Discord webhook with 5-min grace period
- spawn_events now a TimescaleDB hypertable with 7-day retention
- server_health_checks removed (write-only bloat)
- PostgreSQL memory tuning (shared_buffers 8GB on 32GB host)
- Uvicorn runs without --reload in production
- deploy-frontend.sh requirement for React builds
- Combat stats (Mag-Tools style), vital sharing, all WS event types
- Cross-machine vital sharing via WebSocket relay
- Deploy flows (quick / frontend / full rebuild)
- BUILD_VERSION CalVer stamp format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 14:55:30 +02:00
Erik
f7f04d6a84 Revert floating badge, remove debug logs
The floating version badge scrolled awkwardly and wasn't necessary
now that the bind-mount/deploy issue is fixed. The existing ml-version
inside the Sidebar is sufficient.

Also removed the temporary [INV_DEBUG] console logs from useLiveData
and InventoryWindow — the inventory live-update bug is confirmed fixed.
Kept the per-character inventoryVersions fix and the cache-buster on
the refetch URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:20:24 +02:00
Erik
73e204c82b Properly deploy React build via deploy-frontend.sh
The previous commits built into static/_build/ but forgot to run the
deploy script that copies the output to static/index.html and
static/assets/. The web server serves from static/, so none of the
previous frontend changes (per-character inventory version, debug logs,
version badge) were actually reaching the browser.

This commit runs deploy-frontend.sh which copies _build/ → static/,
replacing the stale index-BHGeM5hq bundle with the current one.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:16:35 +02:00
Erik
a5ff228d4f Add floating version badge in top-left corner
Small yellow badge fixed at position (4, 4) showing the running build
version. Helps visually confirm which bundle a browser is loading when
diagnosing cache issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:14:43 +02:00
Erik
0ff396cd0e Add debug logging for inventory live-update tracing + cache-bust fetch
Temporary instrumentation to diagnose why InventoryWindow doesn't refresh
on inventory_delta. Three log points:
- useLiveData: logs when inventory_delta arrives and version bump
- InventoryWindow effect: logs every run with state
- InventoryWindow fetch: logs when debounce fires and result count

Also added cache-buster (_t=timestamp) to the refetch URL in case HTTP
caching is masking fresh data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:05:51 +02:00
Erik
d26f1f725c Fix inventory window never refreshing live (per-character version)
The inventoryVersion counter in useLiveData was a single global value that
bumped on every inventory_delta for any character. With 60+ active chars
all generating deltas, the global counter advanced multiple times per
second.

InventoryWindow's debounce effect watched this global counter, so every
bump reset its 2-second fetch timer. Since bumps arrived faster than 2s,
the fetch never fired — the window appeared frozen until the user closed
and reopened it (which triggered the initial-fetch effect).

Fix: make inventoryVersions a Map<string, number> keyed by character name.
Each inventory_delta now only bumps its own character's counter, so an
open window's debounce correctly fires 2s after its character's last
delta, ignoring unrelated traffic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:57:55 +02:00
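The starvation described in the commit above is easy to reproduce on paper. The following is a Python simulation of the timer logic only, not the React hook itself; `debounce_fire_times` is a hypothetical helper.

```python
from typing import Iterable, List


def debounce_fire_times(bump_times: Iterable[float], delay: float) -> List[float]:
    """Times at which a trailing debounce fires for the given bump times.

    Every bump restarts the timer, so a fetch fires only when the gap to
    the next bump is at least `delay`, or after the final bump.
    """
    bumps = sorted(bump_times)
    fires = []
    for i, t in enumerate(bumps):
        is_last = i == len(bumps) - 1
        if is_last or bumps[i + 1] - t >= delay:
            fires.append(t + delay)
    return fires
```

With a global counter bumped every 0.5s, a 2s debounce fires only once the traffic stops; with a per-character counter bumped at 0s and 5s, it fires 2s after each bump.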
Erik
7dc5996820 Fix PostgreSQL shared_buffers (was 96GB on 32GB host)
timescaledb-tune had configured shared_buffers=96396MB — three times the
physical RAM of the host. The kernel was giving PG everything it could
(~30GB of shared memory), leaving <100MB free for everything else.
This caused the OS page cache to be constantly evicted, every query to
hit disk, and telemetry writes to balloon to 20+ seconds.

New settings (standard 25/50 rule for 32GB):
- shared_buffers: 96GB → 8GB
- effective_cache_size: 16GB (query planner hint)
- work_mem: 16MB per operation
- maintenance_work_mem: 1GB (for vacuum/index)
- max_wal_size: 4GB

Requires a db container restart to take effect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:11:35 +02:00
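The 25/50 arithmetic behind the new settings in the commit above, as a small sketch. Both helper names are hypothetical, and only the two values derived from the rule are computed.

```python
def pg_memory_settings(ram_gb: int) -> dict:
    """Apply the 25/50 rule of thumb: 25% shared_buffers, 50% cache hint."""
    return {
        "shared_buffers": f"{ram_gb // 4}GB",
        "effective_cache_size": f"{ram_gb // 2}GB",
    }


def exceeds_ram(shared_buffers_mb: int, ram_gb: int) -> bool:
    """True if shared_buffers alone is larger than physical RAM."""
    return shared_buffers_mb > ram_gb * 1024
```

For a 32GB host this yields 8GB and 16GB, and `exceeds_ram(96396, 32)` flags the old value.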
Erik
926e912c57 Server load optimization: spawn_events retention + log spam fix
Database cleanup:
- Converted spawn_events to a TimescaleDB hypertable with 7-day retention.
  Previously a regular table growing unbounded — had reached 482M rows/66GB
  from June 2025. Manual migration copied last 7 days (12M rows) to a new
  hypertable, swapped names, and dropped the old table.
  Result: DB shrunk from 77GB → 12GB.
- Dropped server_health_checks table entirely. It was write-only (850K rows,
  134MB) — only current state in server_status is actually read. Eliminated
  the insert from monitor_server_health().

Telemetry handler cleanup:
- Removed 4 per-message INFO log lines (TELEMETRY_RECEIVED, DB_WRITE_ATTEMPT,
  DB_WRITE_SUCCESS, PROCESSING_COMPLETE). With 60+ chars each reporting
  every 2s, that was 120+ log lines/sec. Replaced with single SLOW_*
  warnings above 500ms/1000ms thresholds.
- Removed redundant pool-size introspection (try/except + hasattr) on every
  telemetry message — useless noise in the hot path.
- Removed debug cache-miss and kill-delta logs.

Log level:
- docker-compose.yml: dereth-tracker LOG_LEVEL DEBUG → INFO (was dumping
  entire inventory_delta JSON payloads for every item update).
- inventory-service LOG_LEVEL DEBUG → INFO.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:08:51 +02:00
Erik
de2cc3a0e3 Add WS message filtering, idle grace period, webhook env var
- Browser WS clients can now send {"type": "subscribe", "message_types": [...]}
  to only receive specific message types. Default is all (no change for browsers).
- Discord bot subscribes to only "rare" and "chat" — eliminates 82GB+ of
  unnecessary telemetry/vitals/inventory traffic.
- Idle detection now has a 5-minute grace period before firing Discord alerts,
  preventing false positives on brief idle states.
- Added DISCORD_ACLOG_WEBHOOK env var to docker-compose.yml for death/idle alerts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:08:55 +02:00
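A sketch of the subscription filter from the commit above, using the subscribe message shape it quotes. The `Subscriber` class and its method names are hypothetical.

```python
import json
from typing import Optional, Set


class Subscriber:
    """Tracks which message types one WS client wants; None means all."""

    def __init__(self) -> None:
        self.message_types: Optional[Set[str]] = None

    def handle_control(self, raw: str) -> bool:
        """Apply a subscribe message; return True if it was one."""
        msg = json.loads(raw)
        if not isinstance(msg, dict) or msg.get("type") != "subscribe":
            return False
        self.message_types = set(msg.get("message_types", []))
        return True

    def wants(self, message_type: str) -> bool:
        """True if a broadcast of this type should be sent to the client."""
        return self.message_types is None or message_type in self.message_types
```

A client that never subscribes keeps receiving everything, which matches the stated default for browsers.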
Erik
3885b408c9 Remove --reload from production uvicorn CMD
The --reload flag without watchfiles installed causes uvicorn to
fall back to a polling-based file watcher that busy-loops at 100% CPU.
This was burning an entire core 24/7 doing nothing useful in production.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:45:43 +02:00
Erik
475c7aba03 feat: harder shake then spin 🌀🎵
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:25:13 +02:00
Erik
30c4067c99 feat: easter egg 🎵
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:22:44 +02:00
Erik
356c0d97b9 fix: allow internal Docker connections to /ws/live without auth
The Discord bot connects to /ws/live from the Docker internal network
(172.x.x.x) but has no session cookie, causing 4401 auth failures.

Now: connections from Docker internal network (172.x.x.x), localhost,
or ::1 skip the session cookie check. External connections (through
Nginx) still require authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 21:10:02 +02:00
Erik
adb9d5feab feat: major cleanup + death alerts + idle detection + Discord webhooks
Cleanup:
- Removed 109 stale asset files from static/assets/ (was 122, now 13)
- Removed static/v2/ entirely (was duplicate of root assets)
- Removed dead dashboard code: DashboardView, Layout, GlobalStats,
  CharacterCard, CharacterGrid, VitalBar, TabContainer, CombatTab,
  RaresTab, MapTab, InventoryTab, global.css, MapTransformContext
- Removed recharts dependency (425KB chunk eliminated)
- CSS reduced from 17KB to 10KB
- Added deploy-frontend.sh script for one-command build+deploy
- Updated CLAUDE.md with combat_stats, share_*, dungeon_map events
  and React frontend architecture

Death alerts (frontend + backend):
- Frontend: DeathNotification component with red banner + sawtooth
  sound when vitae goes from 0 to >0
- Backend: detects vitae transition in vitals handler, sends Discord
  webhook to #aclog with "☠️ CHARACTER died! (vitae: X%)"
- Rate-limited: max 1 Discord alert per character per 5 minutes

Idle detection (backend):
- Background task runs every 60 seconds
- Detects: vt_state "default"/"idle" OR kph=0 while in combat/hunt
- Sends Discord webhook: "⚠️ CHARACTER appears idle (state: X, KPH: 0)"
- Auto-clears alert when character becomes active again
- No duplicate alerts for same idle period

Discord integration:
- DISCORD_ACLOG_WEBHOOK env var for webhook URL
- Used by both death alerts and idle detection
- Graceful fallback when not configured

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:32:14 +02:00
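The death-alert rule in the commit above (vitae going from 0 to above 0, at most one alert per character per 5 minutes) can be sketched as follows. `DeathAlerter` is a hypothetical name, and the caller is assumed to pass timestamps in seconds.

```python
from typing import Dict


class DeathAlerter:
    """Flags a death when vitae goes from 0 to >0, at most once per cooldown."""

    def __init__(self, cooldown_s: float = 300.0) -> None:
        self.cooldown_s = cooldown_s
        self._last_vitae: Dict[str, float] = {}
        self._last_alert: Dict[str, float] = {}

    def update(self, character: str, vitae: float, now: float) -> bool:
        """Record a vitals sample; return True if an alert should be sent."""
        previous = self._last_vitae.get(character)
        self._last_vitae[character] = vitae
        died = previous == 0 and vitae > 0
        if not died:
            return False
        last = self._last_alert.get(character)
        if last is not None and now - last < self.cooldown_s:
            return False  # rate-limited
        self._last_alert[character] = now
        return True
```

The first sample for a character never alerts, because there is no previous vitae value to compare against.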
Erik
d2c30b610b fix(v2): character window now updates live from WebSocket
The CharacterWindow only fetched once from API on mount and never
updated. Now:
- character_stats WS messages are tracked in useLiveData via ref
- Passed through WindowRenderer to CharacterWindow as liveStats prop
- Window uses live WS data when available, falls back to API fetch
- Attributes, skills, vitals base values, properties (augmentations,
  ratings, etc.), allegiance all update in real-time

Also: vitals bars in the character window use live WS vitals data
(health_percentage etc.) for real-time HP/Stamina/Mana display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:02:49 +02:00
Erik
a5bd659876 feat(v2): remove old dashboard, add vitae + resizable windows
- Removed old Recharts dashboard view entirely (no more viewMode
  toggle, DashboardView lazy import, Ctrl+D shortcut)
- Recharts chunk eliminated from build — bundle size reduced
- Player Dashboard window: added Vitae column (red when > 0%)
- ALL windows now resizable: drag bottom-right corner handle
  (min 300×200px). Subtle diagonal line grip indicator.
- Sidebar: removed 📊 Dashboard toggle link, removed broken
  /quest-status.html external link (replaced by 📜 Quests window)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:33:07 +02:00
Erik
938421999a feat(v2): Quest Status + Player Dashboard as React windows
Quest Status window (📜 Quests in sidebar):
- Fetches GET /quest-status API (polls every 30s)
- Grid: characters as rows × all unique quests as columns
- "READY" shown in green, countdowns in yellow, missing as dash
- Quest names shortened (removes "Timer", "Pickup" suffixes)
- Sticky header row, scrollable body
- Replaces broken quest-status.html link

Player Dashboard window (👥 Dashboard in sidebar):
- Sortable table of all online characters
- Columns: Character, State, KPH, Session kills, Total kills,
  Rares (total + session), Deaths, Uptime, HP%, Tapers
- Click column headers to sort (ascending/descending toggle)
- State badges: green=combat/hunt, red=other, gray=idle
- KPH in green, rares in gold, deaths in red (if > 0)
- HP% color-coded: green >80%, yellow >40%, red below

Sidebar changes:
- Removed broken /quest-status.html external link
- Added 👥 Dashboard + 📜 Quests as window opener buttons
- Both lazy-loaded (only fetched when first opened)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:02:00 +02:00
Erik
27caa21a56 style(v2): hide sidebar scrollbar between player column and map
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:58:09 +02:00
Erik
1a7300df37 style(v2): hide scrollbar on dashboard main area
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:56:25 +02:00
Erik
666af817a2 fix: add missing useRef import to InventoryWindow
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:54:51 +02:00
Erik
4638e60043 fix(v2): inventory no longer flickers — debounced re-fetch, no loading flash
inventory_delta WS messages were triggering immediate full re-fetch
with setLoading(true), causing content to flash blank. Now:
- Initial load shows loading state (once)
- Subsequent deltas debounced to 2s (batches rapid changes)
- Re-fetch runs silently without clearing existing items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:54:06 +02:00
Erik
0112c59514 feat(v2): 13 improvements — functional, visual, UX, backend
Functional:
1. Chat: "▼ New messages below" indicator when scrolled up, click to jump
2. Combat stats: "Clear Session" button (red, with confirm dialog)
3. Inventory: live updates via inventory_delta WS (re-fetches on change)
4. Inventory: real mana time from equipment_cantrip_state WS (live
   countdown with state dot: green=active, red=inactive, yellow=unknown)

Visual:
5. Thin separator line between tool links and sort buttons
6. Selected player row highlighted with darker background (#2a3344)
7. Scroll-to-top button (▲) appears when scrolled past 200px in player list

UX:
8. Double-click player dot on map opens their chat window
9. Right-click player dot shows context menu (Chat/Stats/Inv/Char/Combat/Radar)
10. Ctrl+D keyboard shortcut toggles between map and dashboard views
11. Sound notification on rare drops (880Hz sine beep via Web Audio API)

Backend:
12. Deep-merge lifetime offense/defense per element — accumulates
    total_attacks, failed_attacks, crits, damage per AttackType×Element
    instead of overwriting with latest session data
13. Startup cleanup: deletes stale combat_stats records from before
    the lifetime fix (pre-2026-04-14T09:00Z)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 13:49:40 +02:00
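Item 12 in the commit above, the deep merge of lifetime stats, might look roughly like this. `merge_lifetime` is a hypothetical name, and the sketch assumes the session counters are increments that have not been merged before (merging the same totals twice would double-count).

```python
from typing import Dict

Counters = Dict[str, float]
Stats = Dict[str, Dict[str, Counters]]  # AttackType -> Element -> counters


def merge_lifetime(lifetime: Stats, session: Stats) -> Stats:
    """Return lifetime stats with the session's counters added in."""
    # Copy the nested dicts so the caller's lifetime data is not mutated.
    merged: Stats = {
        attack: {elem: dict(c) for elem, c in elems.items()}
        for attack, elems in lifetime.items()
    }
    for attack, elems in session.items():
        for elem, counters in elems.items():
            target = merged.setdefault(attack, {}).setdefault(elem, {})
            for key, value in counters.items():
                target[key] = target.get(key, 0) + value
    return merged
```

Counters such as total_attacks, failed_attacks, crits and damage accumulate per AttackType and Element instead of being overwritten by the latest session.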
Erik
0b64c6ccff feat(v2): chat command history + smart auto-scroll
Command history:
- Up/Down arrow keys browse sent command history (like bash/console)
- 50 commands stored per character in localStorage
- Persists across page reloads and browser sessions
- Current input preserved when browsing (restored on Down past end)
- Duplicates kept (matches user preference)

Smart auto-scroll:
- New messages only auto-scroll if user is already at the bottom
- If user has scrolled up to read history, it stays put
- Sending a message snaps back to bottom
- 30px threshold for "at bottom" detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:51:45 +02:00
Erik
8a2d0e1a72 style(v2): amber/yellow meta states now show red instead
Non-active, non-idle VTank states (nav, turn_in_quests, etc.) now
display in red instead of amber/yellow in both:
- Map sidebar: .ml-meta-pill.other (red background + text)
- Dashboard cards: .badge-other (red background + text)

Green = combat/hunt, Red = nav/other states, Gray = idle/default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:36:23 +02:00
Erik
9f7686681b feat: v2 React frontend is now primary at /
- v1 vanilla JS frontend moved to /classic (static/classic/)
- v2 React app now serves at / (root)
- Vite base changed from /v2/ to /
- Assets at /assets/, service worker at /sw.js
- /classic still works — all v1 files preserved with relative paths
- /v2 still works as before (build output unchanged)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:17:23 +02:00
Erik
69678a9426 perf(v2): 8 optimizations — 24% smaller bundle, fewer re-renders
1. React.memo on WindowRenderer — prevents re-renders when parent
   state changes but no windows are affected

2. Coordinate display via direct DOM ref — no React state updates
   on mouse move (was triggering re-renders on every pixel)

3. useDeferredValue for sidebar vitals + player list — React
   prioritizes map interactions over stat text updates

4. Chat messages in ref — stores in useRef instead of useState,
   only bumps a version counter for re-render. Eliminates a
   new Map() allocation on every chat message.

5. Lazy-load 8 window components — InventoryWindow, CharacterWindow,
   RadarWindow, CombatStatsWindow, IssuesWindow, VitalSharingWindow,
   StatsWindow, CombatPickerWindow all loaded on first open.
   Main bundle dropped from 278KB to 211KB (24% reduction).

6. Preload critical assets — dereth.png, backpack icon, dungeon_tiles.json
   via <link rel="preload"> in index.html for instant map render.

7. Bundle splitting — React runtime extracted to separate 12KB chunk
   (cached independently). Window components split into 8 chunks.
   Total: 13 chunks vs previous 2.

8. Service worker — caches map images, icon sprites, and dungeon tiles.
   Icon images cached on first fetch. Repeat page loads serve from
   cache instantly. Auto-cleans old cache versions.

Net result:
- Initial load: 211KB main + 17KB CSS (was 278KB + 17KB)
- React cached separately: 12KB
- Windows load on demand: 1-15KB each
- Dashboard with Recharts: 425KB (unchanged, still lazy)
- Map images/icons: cached by service worker after first load

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:11:08 +02:00
Erik
19d95a370f chore: update tsconfig build cache — working baseline
All features functional: map view, sidebar, player dots/trails/heatmap/portals,
draggable windows (chat/stats/inventory/character/radar/combat/issues/vitals),
session+lifetime combat stats, 60-color palette, rare notifications, dungeon
radar, version display. Performance: code-split Recharts, direct DOM pan/zoom,
deferred player list, memoized derived data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:06:50 +02:00
Erik
9611868266 fix(v2): remove nav links from dashboard header, move Map View button left
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:01:31 +02:00
Erik
1e125f7653 fix(v2): tool links open in new tab (target=_blank)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:57:00 +02:00