# Overlord Agent

A small host-side Python service that gives Claude Code (running in headless mode) access to live Overlord data so it can answer questions from the dashboard chat window.

## Why a separate service?

`dereth-tracker` runs in Docker. The `claude` CLI binary at `/home/erik/.local/bin/claude` depends on `~/.claude` credentials owned by user `erik` on the host. The tracker container can't invoke it.

So this service runs **outside** Docker, listens on `127.0.0.1:8767`, and nginx routes `/api/agent/*` to it. It validates the same browser session cookie the tracker issues (shared `SECRET_KEY`) and shells out to `claude -p` with `cwd=/home/erik/MosswartOverlord`.

## Architecture

```
Browser ──nginx──┬─► /api/*        ──► dereth-tracker (Docker, 8765)
                 │
                 └─► /api/agent/*  ──► overlord-agent (host, 8767)
                                          │
                                          ├─► subprocess: claude -p ...
                                          │       │
                                          │       └─► MCP stdio ──► mcp_overlord.py
                                          │                             │
                                          │                             └─► HTTP loopback to tracker
                                          │                             └─► asyncpg to dereth-db
                                          │
                                          └─► validates "session" cookie
```

## Files

| File | What |
|------|------|
| `service.py` | FastAPI app (`/agent/health`, `/agent/sessions/new`, `/agent/ask`, `/agent/sessions/{id}/history`) |
| `auth.py` | Session-cookie validation (mirrors `main.py:1013-1019`) |
| `claude_wrapper.py` | `asyncio.create_subprocess_exec("claude", "-p", ...)` |
| `tools.py` | Pure tool implementations (HTTP loopback + read-only DB) |
| `mcp_overlord.py` | MCP stdio server registering tools for Claude Code |
| `sql/0001_overlord_agent_ro.sql` | Read-only PG role for the SQL tool |
| `overlord-agent.service` | systemd unit |
| `install.sh` | One-shot installer (venv + pip install + systemd) |

## Required env vars (in repo-root `.env`)

```
SECRET_KEY=
AGENT_DB_DSN=postgresql://overlord_agent_ro:@127.0.0.1:5432/dereth
TRACKER_URL=http://127.0.0.1:8765          # optional, this is the default
CLAUDE_BIN=/home/erik/.local/bin/claude    # optional, this is the default
CLAUDE_CWD=/home/erik/MosswartOverlord     # optional, this is the default
CLAUDE_TIMEOUT_S=120                       # optional
```

## First-time setup on the server

1. **Create the read-only DB role** (one-time):

   ```bash
   docker exec -i dereth-db psql -U postgres -d dereth \
     < /home/erik/MosswartOverlord/agent/sql/0001_overlord_agent_ro.sql
   docker exec -it dereth-db psql -U postgres -d dereth \
     -c "ALTER ROLE overlord_agent_ro PASSWORD '';"
   ```

2. **Add `AGENT_DB_DSN`** to `/home/erik/MosswartOverlord/.env` with the password you just set.

3. **Run the installer**:

   ```bash
   cd /home/erik/MosswartOverlord
   bash agent/install.sh
   ```

4. **Update nginx**: edit `/etc/nginx/sites-enabled/overlord` to add the `/api/agent/` location (already in `nginx/overlord.conf` in the repo; just `sudo cp` and reload).

## Day-to-day deploy

After editing any agent file:

```bash
# On dev:
git push

# On server:
ssh erik@overlord.snakedesert.se
cd /home/erik/MosswartOverlord
git pull
sudo systemctl restart overlord-agent
journalctl -u overlord-agent -f   # tail logs
```

For Python dependency changes:

```bash
agent/.venv/bin/pip install -r agent/requirements.txt
sudo systemctl restart overlord-agent
```

## Smoke tests

```bash
# 1. Service alive?
curl http://127.0.0.1:8767/agent/health

# 2. Cookie required?
curl -X POST http://127.0.0.1:8767/agent/ask \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"x","message":"hi"}'
# ⇒ 401

# 3. Direct claude invocation works?
echo "hello" | /home/erik/.local/bin/claude -p \
  --session-id 11111111-1111-1111-1111-111111111111 \
  --output-format json

# 4. End-to-end via nginx (with cookie):
curl -X POST https://overlord.snakedesert.se/api/agent/ask \
  -b 'session=' \
  -H 'Content-Type: application/json' \
  -d '{"session_id":"","message":"How many characters are online?"}'
```

## Cost / rate-limit notes

- Each `/agent/ask` shells out to `claude -p` once.
- We use the user's Claude subscription (no API key): flat-rate, no per-call billing, but subscription-tier rate limits still apply.
- **Reactive only**: there are no background loops or periodic ticks.
  Each user message = one Claude turn (which may chain several tool calls internally before producing a final answer).
- The SQL tool is hard-capped at 10s and 200 rows.
- `suitbuilder_search` is the only tool that can take minutes; nginx read timeout is 180s for `/api/agent/`.

## Adding a new MCP tool

1. Implement `async def my_tool(...) -> dict` in `tools.py`.
2. Register it in `mcp_overlord.py` under `TOOL_DEFS`:
   - description (the agent reads this to decide when to call)
   - JSON schema for arguments
   - lambda dispatching to `T.my_tool(...)`
3. `sudo systemctl restart overlord-agent`. Claude Code re-discovers the tool list on each invocation.
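
The steps for adding a tool can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the `character` and `limit` arguments, the dictionary keys (`description`, `schema`, `handler`), and the `dispatch` helper are all assumptions made for the example, and the real `TOOL_DEFS` in `mcp_overlord.py` may be shaped differently. In the repo the tool body would live in `tools.py` and be reached as `T.my_tool(...)`; here it is defined inline so the snippet runs on its own.

```python
import asyncio
from typing import Any


# Step 1 (hypothetical body): an async tool that returns a plain dict.
# A real tool would call the tracker over HTTP loopback or query the read-only DB.
async def my_tool(character: str, limit: int = 10) -> dict:
    return {"character": character, "limit": limit, "items": []}


# Step 2 (assumed registry shape): description, JSON schema, and a lambda
# that forwards the parsed arguments to the tool implementation.
TOOL_DEFS: dict[str, dict[str, Any]] = {
    "my_tool": {
        "description": "Placeholder tool: returns an empty item list for a character.",
        "schema": {
            "type": "object",
            "properties": {
                "character": {"type": "string"},
                "limit": {"type": "integer", "minimum": 1},
            },
            "required": ["character"],
        },
        "handler": lambda args: my_tool(**args),
    },
}


# Hypothetical dispatcher standing in for whatever the MCP server does
# when Claude Code calls a tool by name.
async def dispatch(name: str, args: dict[str, Any]) -> dict:
    return await TOOL_DEFS[name]["handler"](args)


if __name__ == "__main__":
    print(asyncio.run(dispatch("my_tool", {"character": "Example"})))
```

Keeping the tool body free of MCP-specific code, as the `tools.py` / `mcp_overlord.py` split suggests, means it can be exercised directly with `asyncio.run(...)` before the service is restarted.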