
# Overlord Agent

A small host-side Python service that gives Claude Code (running in
headless mode) access to live Overlord data so it can answer questions
from the dashboard chat window.

## Why a separate service?

`dereth-tracker` runs in Docker. The `claude` CLI binary lives on the
host at `/home/erik/.local/bin/claude` and depends on `~/.claude`
credentials owned by user `erik`, so the tracker container can't invoke it.

So this service runs **outside** Docker, listens on `127.0.0.1:8767`,
and nginx routes `/api/agent/*` to it. It validates the same browser
session cookie the tracker issues (shared `SECRET_KEY`) and shells out
to `claude -p` with `cwd=/home/erik/MosswartOverlord`.
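
The "shell out" step can be sketched roughly as follows. This is a minimal illustration, not the contents of `claude_wrapper.py`: the helper name `run_cli` is hypothetical, and the demo uses a stand-in Python one-liner instead of `claude` so it runs on any machine.

```python
from __future__ import annotations

import asyncio
import sys
from typing import Optional


async def run_cli(argv: list[str], prompt: str, cwd: Optional[str] = None,
                  timeout_s: float = 120.0) -> str:
    """Run a CLI, feed `prompt` on stdin, and return its stdout as text."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        cwd=cwd,
    )
    try:
        out, err = await asyncio.wait_for(
            proc.communicate(prompt.encode()), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        proc.kill()        # don't leave a runaway child behind
        await proc.wait()  # reap it before re-raising
        raise
    if proc.returncode != 0:
        raise RuntimeError(
            f"exit {proc.returncode}: {err.decode(errors='replace')}"
        )
    return out.decode()


if __name__ == "__main__":
    # Stand-in command (assumption for the demo). The real service would pass
    # something like [CLAUDE_BIN, "-p", "--output-format", "json"] and
    # cwd=CLAUDE_CWD instead.
    echo = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
    print(asyncio.run(run_cli(echo, "hello")))
```

Killing the child on timeout matters here because each request spawns a fresh process; without it, a hung `claude` run would outlive the HTTP request that started it.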

## Architecture

```
Browser ──nginx──┬─► /api/*       ──► dereth-tracker (Docker, 8765)
                 │
                 └─► /api/agent/* ──► overlord-agent  (host, 8767)
                                          │
                                          ├─► subprocess: claude -p ...
                                          │       │
                                          │       └─► MCP stdio ──► mcp_overlord.py
                                          │                              │
                                          │                              ├─► HTTP loopback to tracker
                                          │                              └─► asyncpg to dereth-db
                                          │
                                          └─► validates "session" cookie
```

## Files

| File | What |
|------|------|
| `service.py` | FastAPI app (`/agent/health`, `/agent/sessions/new`, `/agent/ask`, `/agent/sessions/{id}/history`) |
| `auth.py` | Session-cookie validation (mirrors `main.py:1013-1019`) |
| `claude_wrapper.py` | `asyncio.create_subprocess_exec("claude", "-p", ...)` |
| `tools.py` | Pure tool implementations (HTTP loopback + read-only DB) |
| `mcp_overlord.py` | MCP stdio server registering tools for Claude Code |
| `sql/0001_overlord_agent_ro.sql` | Read-only PG role for the SQL tool |
| `overlord-agent.service` | systemd unit |
| `install.sh` | One-shot installer (venv + pip install + systemd) |

## Env vars (in repo-root `.env`)

```
SECRET_KEY=<same value the tracker uses to sign cookies>
AGENT_DB_DSN=postgresql://overlord_agent_ro:<password>@127.0.0.1:5432/dereth
TRACKER_URL=http://127.0.0.1:8765           # optional, this is the default
CLAUDE_BIN=/home/erik/.local/bin/claude     # optional, this is the default
CLAUDE_CWD=/home/erik/MosswartOverlord      # optional, this is the default
CLAUDE_TIMEOUT_S=120                        # optional
```
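
One way the service might read these values is sketched below. The `AgentConfig` class and `load_config` function are hypothetical names, and the sketch assumes the variables are already in the process environment (how `.env` gets loaded is not shown here). Treating `120` as the default for `CLAUDE_TIMEOUT_S` is also an assumption; the listing above only marks it optional.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    secret_key: str
    agent_db_dsn: str
    tracker_url: str
    claude_bin: str
    claude_cwd: str
    claude_timeout_s: float


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build the config, failing fast if a required variable is missing."""
    missing = [k for k in ("SECRET_KEY", "AGENT_DB_DSN") if not env.get(k)]
    if missing:
        raise RuntimeError("missing required env vars: " + ", ".join(missing))
    return AgentConfig(
        secret_key=env["SECRET_KEY"],
        agent_db_dsn=env["AGENT_DB_DSN"],
        # Defaults below are the ones documented in the listing above.
        tracker_url=env.get("TRACKER_URL", "http://127.0.0.1:8765"),
        claude_bin=env.get("CLAUDE_BIN", "/home/erik/.local/bin/claude"),
        claude_cwd=env.get("CLAUDE_CWD", "/home/erik/MosswartOverlord"),
        # Assumed default; the README does not state one explicitly.
        claude_timeout_s=float(env.get("CLAUDE_TIMEOUT_S", "120")),
    )
```

Failing at startup on a missing `SECRET_KEY` is preferable to discovering it on the first request, when every cookie check would fail.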

## First-time setup on the server

1. **Create the read-only DB role** (one-time):
   ```bash
   docker exec -i dereth-db psql -U postgres -d dereth \
       < /home/erik/MosswartOverlord/agent/sql/0001_overlord_agent_ro.sql
   docker exec -it dereth-db psql -U postgres -d dereth \
       -c "ALTER ROLE overlord_agent_ro PASSWORD '<random-password>';"
   ```
2. **Add `AGENT_DB_DSN`** to `/home/erik/MosswartOverlord/.env` with the
   password you just set.
3. **Run the installer**:
   ```bash
   cd /home/erik/MosswartOverlord
   bash agent/install.sh
   ```
4. **Update nginx**: the `/api/agent/` location is already in
   `nginx/overlord.conf` in the repo. `sudo cp` that file over
   `/etc/nginx/sites-enabled/overlord`, then reload nginx.

## Day-to-day deploy

After editing any agent file:

```bash
# On dev:
git push

# On server:
ssh erik@overlord.snakedesert.se
cd /home/erik/MosswartOverlord
git pull
sudo systemctl restart overlord-agent
journalctl -u overlord-agent -f   # tail logs
```

For Python dependency changes:

```bash
agent/.venv/bin/pip install -r agent/requirements.txt
sudo systemctl restart overlord-agent
```

## Smoke tests

```bash
# 1. Service alive?
curl http://127.0.0.1:8767/agent/health

# 2. Cookie required?
curl -X POST http://127.0.0.1:8767/agent/ask \
    -H 'Content-Type: application/json' \
    -d '{"session_id":"x","message":"hi"}'
#  ⇒ 401

# 3. Direct claude invocation works?
echo "hello" | /home/erik/.local/bin/claude -p \
    --session-id 11111111-1111-1111-1111-111111111111 \
    --output-format json

# 4. End-to-end via nginx (with cookie):
curl -X POST https://overlord.snakedesert.se/api/agent/ask \
    -b 'session=<your-session-cookie>' \
    -H 'Content-Type: application/json' \
    -d '{"session_id":"<uuid>","message":"How many characters are online?"}'
```

## Cost / rate-limit notes

- Each `/agent/ask` shells out to `claude -p` once.
- We use the user's Claude subscription (no API key) — flat-rate, no
  per-call billing, but subscription-tier rate limits still apply.
- **Reactive only**: there are no background loops or periodic ticks.
  Each user message = one Claude turn (which may chain several tool
  calls internally before producing a final answer).
- The SQL tool is hard-capped at 10s and 200 rows.
- `suitbuilder_search` is the only tool that can take minutes; nginx
  read timeout is 180s for `/api/agent/`.
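
The 10s / 200-row cap on the SQL tool could be enforced along these lines. This is a sketch under assumptions, not the code in `tools.py`: `run_capped` and `fake_fetch` are hypothetical names, and the fetch is a stand-in coroutine rather than a real database call.

```python
import asyncio
from typing import Awaitable, Callable

MAX_ROWS = 200
TIMEOUT_S = 10.0


async def run_capped(fetch: Callable[[], Awaitable[list]],
                     timeout_s: float = TIMEOUT_S,
                     max_rows: int = MAX_ROWS) -> dict:
    """Await `fetch()`, giving up after `timeout_s` and keeping `max_rows` rows."""
    try:
        rows = await asyncio.wait_for(fetch(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"error": f"query exceeded {timeout_s:g}s"}
    return {"rows": rows[:max_rows], "truncated": len(rows) > max_rows}


async def fake_fetch() -> list:
    # Stand-in for a database query that returns more rows than the cap allows.
    await asyncio.sleep(0)
    return [{"n": i} for i in range(500)]


if __name__ == "__main__":
    result = asyncio.run(run_capped(fake_fetch))
    print(len(result["rows"]), result["truncated"])
```

Truncating after the fetch still pulls every row over the wire, so a real implementation would probably also bound the work on the database side, for example with PostgreSQL's `statement_timeout` and a `LIMIT` on the query.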

## Adding a new MCP tool

1. Implement `async def my_tool(...) -> dict` in `tools.py`.
2. Register it in `mcp_overlord.py` under `TOOL_DEFS`:
   - description (the agent reads this to decide when to call)
   - JSON schema for arguments
   - lambda dispatching to `T.my_tool(...)`
3. Restart the service with `sudo systemctl restart overlord-agent`.
   Claude Code re-discovers the tool list on each invocation.
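
Steps 1 and 2 might look roughly like this. The actual shape of `TOOL_DEFS` in `mcp_overlord.py` is not shown in this README, so the dictionary keys (`description`, `input_schema`, `handler`) and the `call_tool` dispatcher are assumptions for illustration; `T` here is a stand-in for the imported `tools` module.

```python
import asyncio
from types import SimpleNamespace


# Step 1: the tool itself (would live in tools.py).
async def my_tool(name: str) -> dict:
    return {"greeting": f"hello, {name}"}


# Stand-in for `import tools as T`, so the sketch is self-contained.
T = SimpleNamespace(my_tool=my_tool)

# Step 2: a hypothetical TOOL_DEFS entry with the three parts listed above.
TOOL_DEFS = {
    "my_tool": {
        # The agent reads this to decide when to call the tool.
        "description": "Greet a character by name.",
        # JSON schema for the arguments.
        "input_schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
        # Lambda dispatching to the implementation in tools.py.
        "handler": lambda args: T.my_tool(args["name"]),
    },
}


async def call_tool(tool_name: str, args: dict) -> dict:
    """Look up a tool by name and await its handler."""
    return await TOOL_DEFS[tool_name]["handler"](args)


if __name__ == "__main__":
    print(asyncio.run(call_tool("my_tool", {"name": "Erik"})))
```

Keeping the implementation in `tools.py` free of MCP details means it can be unit-tested directly, with the registration entry acting only as a thin adapter.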