MosswartOverlord/CLAUDE.md
Erik 79cf88d3f7 feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess
Adds an in-dashboard AI assistant that answers questions about live game
state. Designed reactively (no background loops) — every user message in
the chat window or via /api/agent/ask runs one `claude -p` invocation.

Architecture:
- New host-side FastAPI service (agent/) on 127.0.0.1:8767, OUTSIDE the
  dereth-tracker Docker container because `claude` and ~/.claude
  credentials live on the host.
- nginx routes /api/agent/* to the host service.
- The same browser session cookie the tracker issues authenticates
  agent requests (shared SECRET_KEY).
- The agent shells out to `claude -p --session-id <uuid>` with
  cwd=/home/erik/MosswartOverlord. Sessions persist as JSONL on disk
  via Claude Code's built-in machinery.
- An MCP stdio server (agent/mcp_overlord.py) exposes tools to Claude:
  get_live_players, get_recent_rares, query_telemetry_db (read-only,
  parsed by sqlglot to reject DML/DDL), get_player_state, get_inventory,
  get_inventory_search, get_combat_stats, get_equipment_cantrips,
  get_quest_status, get_server_health, suitbuilder_search.
- Read-only PG role (overlord_agent_ro) is the second line of defense
  on the SQL tool — even a parser bypass can't mutate.

Frontend:
- AgentWindow.tsx — draggable chat window with localStorage-pinned
  session UUID, "New Chat" button, on-mount rehydration from
  /agent/sessions/{id}/history (parses Claude Code's JSONL).
- Wired into WindowRenderer + Sidebar (🤖 Assistant button).

Operational:
- systemd unit (overlord-agent.service) + install.sh.
- agent/README.md documents env vars, deploy flow, smoke tests.
- nginx/overlord.conf gets a new /api/agent/ location with 180s timeout.
- CLAUDE.md gets an "Overlord Assistant Mode" section briefing the
  agent on which tools to use and how to behave.

NOT YET DEPLOYED — server still needs:
1. Apply agent/sql/0001_overlord_agent_ro.sql + ALTER ROLE password
2. Add AGENT_DB_DSN to /home/erik/MosswartOverlord/.env
3. bash agent/install.sh (creates venv, installs unit, starts service)
4. sudo cp /home/erik/MosswartOverlord/nginx/overlord.conf to nginx + reload

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 20:43:59 +02:00


# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Dereth Tracker is a real-time telemetry service for game world tracking. It's a FastAPI-based WebSocket and HTTP API service that ingests player position/stats data via plugins and provides live map visualization through a web interface.
## Key Components
### Main Service (main.py)
- WebSocket endpoint `/ws/position` receives telemetry and inventory events
- Routes inventory events to inventory service via HTTP
- Handles real-time player tracking and map updates
### Inventory Service (inventory-service/main.py)
- Separate FastAPI service for inventory management
- Processes inventory JSON into normalized PostgreSQL tables
- Provides search API with advanced filtering and sorting
- Uses comprehensive enum database for translating game IDs to readable names
### Database Architecture
- **Telemetry DB**: TimescaleDB for time-series player tracking data
- **Inventory DB**: PostgreSQL with normalized schema for equipment data
  - `items`: Core item properties
  - `item_combat_stats`: Armor level, damage bonuses
  - `item_enhancements`: Material, item sets, tinkering
  - `item_spells`: Spell names and categories
  - `item_raw_data`: Original JSON for complex queries
## Memories and Known Bugs
* Fixed: Material names now properly display (e.g., "Gold Celdon Girth" instead of "Unknown_Material_Gold Celdon Girth")
* Fixed: Slot column shows "-" instead of "Unknown" for items without slot data
* Fixed: All 208 items in Larsson's inventory now process successfully (was 186 with 22 SQL type errors)
* Added: Type column in inventory search using object_classes enum for accurate item type classification
* Note: ItemType data is inconsistent in JSON - using ObjectClass as primary source for Type column
## Recent Fixes (September 2025)
### Portal Coordinate Rounding Fix ✅ RESOLVED
* **Problem**: Portal insertion failed with duplicate key errors due to coordinate rounding mismatch
* **Root Cause**: Code used 2 decimal places (`ROUND(ns::numeric, 2)`) but database constraint used 1 decimal place
* **Solution**: Changed all portal coordinate checks to use 1 decimal place to match DB constraint
* **Result**: 98% reduction in duplicate key errors (from 600+/min to ~11/min)
* **Location**: `main.py` lines ~1989, 1996, 2025, 2047
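The mismatch can be reproduced without a database. The sketch below is illustrative only: the coordinates and helper names are hypothetical, and `Decimal` with `ROUND_HALF_UP` stands in for PostgreSQL's `ROUND(numeric, n)`, which rounds ties away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_coord(value: str, places: int) -> Decimal:
    """Round a coordinate to `places` decimals, ties away from zero."""
    quantum = Decimal(10) ** -places  # 0.1 for places=1, 0.01 for places=2
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


def key(coords, places):
    """Build the (ns, ew) lookup key at the given precision."""
    return tuple(round_coord(c, places) for c in coords)


# Hypothetical data: a stored portal and a fresh sighting of the same portal.
stored = ("42.31", "17.84")
sighting = ("42.34", "17.82")

# Old behaviour: the existence check compared at 2 decimals and found no match,
# so the code went on to INSERT.
check_2dp_finds_existing = key(stored, 2) == key(sighting, 2)

# The unique constraint compares at 1 decimal, where both rows are identical,
# so that INSERT fails with a duplicate key error.
constraint_1dp_collides = key(stored, 1) == key(sighting, 1)

print(check_2dp_finds_existing, constraint_1dp_collides)
```

Checking at the same precision as the constraint makes the two comparisons agree, which is what the fix does.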
### Character Display Issues ✅ RESOLVED
* **Problem**: Some characters (e.g., "Crazed n Dazed") not appearing in frontend
* **Root Cause**: Database connection pool exhaustion from portal error spam
* **Solution**: Fixed portal errors to reduce database load
* **Result**: Characters now display correctly after portal fix
### Docker Container Deployment
* **Issue**: Code changes require container rebuild with `--no-cache` flag
* **Command**: `docker compose build --no-cache dereth-tracker`
* **Reason**: Docker layer caching can prevent updated source code from being copied
## Current Known Issues
### Minor Portal Race Conditions
* **Status**: ~11 duplicate key errors per minute (down from 600+)
* **Cause**: Multiple players discovering same portal simultaneously
* **Impact**: Minimal - errors are caught and handled gracefully
* **Handling**: A try/except block catches the error, logs it at debug level, and updates the portal timestamp
* **Potential Fix**: PostgreSQL ON CONFLICT DO UPDATE (upsert pattern) would eliminate completely
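A minimal sketch of the upsert pattern follows. It is not the project's code: it runs against an in-memory SQLite database so it is self-contained, and the table layout, column names, and `record_portal` helper are assumptions made for illustration. The `ON CONFLICT (...) DO UPDATE SET ... = excluded....` clause has the same shape in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE portals (
        portal_name   TEXT NOT NULL,
        ns            REAL NOT NULL,
        ew            REAL NOT NULL,
        discovered_at TEXT NOT NULL,
        UNIQUE (ns, ew)          -- hypothetical stand-in for the real constraint
    )
    """
)

UPSERT = """
INSERT INTO portals (portal_name, ns, ew, discovered_at)
VALUES (?, ?, ?, ?)
ON CONFLICT (ns, ew) DO UPDATE SET discovered_at = excluded.discovered_at
"""


def record_portal(name: str, ns: float, ew: float, seen_at: str) -> None:
    """Insert a portal, or refresh its timestamp if the rounded coords exist."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (name, round(ns, 1), round(ew, 1), seen_at))


# Two sightings of the same (hypothetical) portal a few seconds apart.
record_portal("Hypothetical Portal", 42.31, 17.84, "2025-09-01T10:00:00")
record_portal("Hypothetical Portal", 42.34, 17.82, "2025-09-01T10:00:05")

rows = conn.execute("SELECT ns, ew, discovered_at FROM portals").fetchall()
print(rows)
```

Because the database resolves the conflict atomically, the second writer updates the timestamp instead of raising, so no exception handling is needed for the race.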
### Database Initialization Warnings
* **TimescaleDB Hypertable**: `telemetry_events` fails to become a hypertable because its primary key does not include the `timestamp` partitioning column
* **Impact**: None - the table works as a regular PostgreSQL table
* **Warning**: "cannot create a unique index without the column 'timestamp'"
### Connection Pool Under Load
* **Issue**: Database queries can timeout when connection pool is exhausted
* **Symptom**: Characters may not appear during high error load
* **Mitigation**: Portal error fix significantly reduced this issue
## Equipment Suit Builder
### Status: PRODUCTION READY
Real-time equipment optimization engine for building optimal character loadouts by searching across multiple characters' inventories (mules). Uses Mag-SuitBuilder constraint satisfaction algorithms.
**Core Features:**
- Multi-character inventory search across 100+ characters, 25,000+ items
- Armor set constraints (primary 5-piece + secondary 4-piece set support)
- Cantrip/ward spell optimization with bitmap-based overlap detection
- Crit damage rating optimization
- Locked slots with set/spell preservation across searches
- Real-time SSE streaming with progressive phase updates
- Suit summary with copy-to-clipboard functionality
- Stable deterministic sorting for reproducible results
**Access:** `/suitbuilder.html`
**Architecture Details:** See `docs/plans/2026-02-09-suitbuilder-architecture.md`
### Known Limitations
- Slot-aware spell filtering not yet implemented (e.g., underclothes have limited spell pools but system treats all slots equally)
- All spells weighted equally (no priority/importance weighting yet)
- See architecture doc for future enhancement roadmap
## Technical Notes for Development
### Database Performance
- Connection pool: 5-20 connections (configured in `db_async.py`)
- Under heavy error load, pool exhaustion can cause 2-minute query timeouts
- Portal error fix significantly improved database performance
### Docker Development Workflow
1. **Code Changes**: Edit source files locally
2. **Rebuild**: `docker compose build --no-cache dereth-tracker` (required for code changes)
3. **Deploy**: `docker compose up -d dereth-tracker`
4. **Debug**: `docker logs mosswartoverlord-dereth-tracker-1` and `docker logs dereth-db`
### Frontend Architecture
- **Main Map**: `static/index.html` - Real-time player tracking
- **Inventory Search**: `static/inventory.html` - Advanced item filtering
- **Suitbuilder**: `static/suitbuilder.html` - Equipment optimization interface
- **All static files**: Served directly by FastAPI StaticFiles
### DOM Optimization Status ✅ COMPLETE (September 2025)
* **Achievement**: 100% DOM element reuse with zero element creation after initial render
* **Performance**: ~5ms render time for 69 players; eliminated the creation of 4,140+ elements per minute
* **Implementation**: Element pooling system with player name mapping for O(1) lookup
* **Monitoring**: Color-coded console output (✨ green = optimized, ⚡ yellow = partial, 🔥 red = poor)
* **Status**: Production ready - achieving perfect element reuse consistently
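The pooling idea can be sketched in a few lines. The real implementation lives in the browser's JavaScript; the version below is a Python illustration with hypothetical names (`Dot`, `DotPool`), and hiding departed players rather than destroying their elements is an assumption, not something the notes above confirm.

```python
from dataclasses import dataclass


@dataclass
class Dot:
    """Stand-in for a DOM element (hypothetical; the real one is a browser node)."""
    name: str
    x: float = 0.0
    y: float = 0.0
    visible: bool = True


class DotPool:
    """Keeps one element per player name and reuses it on every render."""

    def __init__(self) -> None:
        self._by_name: dict[str, Dot] = {}
        self.created = 0
        self.reused = 0

    def render(self, players: dict[str, tuple[float, float]]) -> None:
        for name, (x, y) in players.items():
            dot = self._by_name.get(name)  # O(1) lookup by player name
            if dot is None:
                dot = Dot(name)            # only happens the first time a player is seen
                self._by_name[name] = dot
                self.created += 1
            else:
                self.reused += 1
            dot.x, dot.y, dot.visible = x, y, True
        # Assumption: players missing from this frame are hidden, not destroyed.
        for name, dot in self._by_name.items():
            if name not in players:
                dot.visible = False

    def get(self, name: str) -> Dot:
        return self._by_name[name]


pool = DotPool()
pool.render({"Alpha": (1.0, 2.0), "Beta": (3.0, 4.0)})  # first frame creates 2
pool.render({"Alpha": (5.0, 6.0)})                      # second frame reuses 1
print(pool.created, pool.reused)
```

After the first frame, every later render only updates existing objects, which is the "0 created, N reused" pattern that the render stats report.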
**Current Render Stats**:
- ✅ This render: 0 dots created, 69 reused | 0 list items created, 69 reused
- ✅ Lifetime: 69 dots created, 800+ reused | 69 list items created, 800+ reused
**Remaining TODO**:
- ❌ Fix CSS Grid layout for player sidebar (deferred per user request)
- ❌ Extend optimization to trails and portal rendering
- ❌ Add memory usage tracking
### WebSocket Endpoints
- `/ws/position`: Plugin telemetry, inventory, portal, rare events (authenticated)
- `/ws/live`: Browser client commands and live updates (unauthenticated)
---
## Overlord Assistant Mode
When invoked through the dashboard's chat window (the **🤖 Assistant** button) or through `/api/agent/ask`, you are acting as the **Overlord Assistant** — answering ad-hoc questions for the user about their live multi-account Asheron's Call setup.
**You have MCP tools** (from `.mcp.json`) for live game data. **Always use them** instead of guessing or apologising for not having data:
- `get_live_players` — current online characters with positions/kills/state
- `get_recent_rares` — rare item finds in the last N hours
- `query_telemetry_db` — read-only SQL on the telemetry DB for ad-hoc analysis
- (more tools added over time — call `list_tools` if unsure)
### Behaviour rules
1. **Use tools, don't speculate.** If the user asks "how many chars are online" — call `get_live_players`. Don't say "I'd need to check" — just check.
2. **Be concise.** The user is glancing at a chat window, not reading a report. 2-5 sentences for most answers. Use markdown tables for tabular data.
3. **No code unless asked.** This mode is about *operating* the system, not editing it. Don't open files or write code unless the user explicitly asks.
4. **Real numbers, real names.** Cite actual character names and counts from tools — never make up sample data.
5. **Read-only.** You cannot mutate the database: the SQL tool rejects any non-SELECT statement, and the database role it connects with has only `SELECT` privileges. If a question requires a write, say so.
6. **Suitbuilder** is a separate, complex tool that runs a constraint search; explain trade-offs in plain English when reporting results.
7. **Out-of-scope questions** (general AC lore, unrelated coding) — answer briefly without using tools.
### Available data tables (for `query_telemetry_db`)
- `telemetry_events` (hypertable, 30-day retention) — position/state snapshots every ~2s per character
- `rare_events` — rare item find log
- `spawn_events` (hypertable, 7-day retention) — monster spawn observations
- `portals` — discovered portal coords (1h dedup window)
- `char_stats`, `rare_stats`, `rare_stats_sessions` — lifetime/session aggregates
- `character_stats` — latest full stats JSON per character
- `combat_stats`, `combat_stats_sessions` — combat tracking
- `server_status` — current Coldeve game-server state (single row)
If asked about something not covered above, look in `db_async.py` for the schema or just try a query and report what you see.