MosswartOverlord/agent
Erik 49ae4369e0 fix(agent): relax SystemCallFilter — Node needs @cpu-emulation etc.
The extra ~@cpu-emulation ~@obsolete ~@swap ~@raw-io negations on top of
@system-service killed Claude Code (Node) with SIGSYS during startup.

Keep just the truly dangerous groups blocked: ~@privileged ~@reboot
~@mount. The base @system-service preset already excludes others (no
@debug, no @resources, etc. are included by default in that preset).
2026-04-25 21:31:14 +02:00
sql feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
__init__.py feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
auth.py feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
claude_wrapper.py feat(agent): cross-char search_items tool + bump timeouts 2026-04-25 21:13:26 +02:00
install.sh feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
mcp_overlord.py feat(agent): cross-char search_items tool + bump timeouts 2026-04-25 21:13:26 +02:00
overlord-agent.service fix(agent): relax SystemCallFilter — Node needs @cpu-emulation etc. 2026-04-25 21:31:14 +02:00
README.md feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
requirements.txt feat(agent): Phase 1 — chat-window AI assistant via Claude Code subprocess 2026-04-25 20:43:59 +02:00
service.py feat(agent): security hardening — systemd lockdown, rate limit, audit log 2026-04-25 21:25:40 +02:00
tools.py feat(agent): cross-char search_items tool + bump timeouts 2026-04-25 21:13:26 +02:00

Overlord Agent

A small host-side Python service that gives Claude Code (running in headless mode) access to live Overlord data so it can answer questions from the dashboard chat window.

Why a separate service?

dereth-tracker runs in Docker. The claude CLI binary at /home/erik/.local/bin/claude depends on ~/.claude credentials owned by user erik on the host. Both live outside the container, so the tracker can't invoke the CLI itself.

So this service runs outside Docker, listens on 127.0.0.1:8767, and nginx routes /api/agent/* to it. It validates the same browser session cookie the tracker issues (shared SECRET_KEY) and shells out to claude -p with cwd=/home/erik/MosswartOverlord.
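The shared-secret idea can be sketched as follows. This is not the code in auth.py or main.py: the cookie layout (base64url JSON payload, a dot, then a hex HMAC-SHA256) and the names sign_session and verify_session are assumptions made for illustration. The point is only that two processes holding the same SECRET_KEY can verify each other's cookies without talking to each other.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def sign_session(payload: dict, secret_key: str) -> str:
    """Hypothetical signer: base64url(JSON payload) + "." + hex HMAC-SHA256."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(secret_key.encode(), body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def verify_session(cookie: str, secret_key: str) -> Optional[dict]:
    """Return the payload if the signature matches, otherwise None."""
    body, sep, sig = cookie.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(secret_key.encode(), body.encode(), hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how much matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    try:
        return json.loads(base64.urlsafe_b64decode(body.encode()))
    except ValueError:
        return None  # malformed base64 or JSON
```

A cookie signed with one key fails verification under any other key, which is why the agent's SECRET_KEY has to match the tracker's exactly.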

Architecture

Browser ──nginx──┬─► /api/*       ──► dereth-tracker (Docker, 8765)
                 │
                 └─► /api/agent/* ──► overlord-agent  (host, 8767)
                                          │
                                          ├─► subprocess: claude -p ...
                                          │       │
                                          │       └─► MCP stdio ──► mcp_overlord.py
                                          │                              │
                                          │                              ├─► HTTP loopback to tracker
                                          │                              └─► asyncpg to dereth-db
                                          │
                                          └─► validates "session" cookie

Files

File                            What
service.py                      FastAPI app (/agent/health, /agent/sessions/new, /agent/ask, /agent/sessions/{id}/history)
auth.py                         Session-cookie validation (mirrors main.py:1013-1019)
claude_wrapper.py               asyncio.create_subprocess_exec("claude", "-p", ...)
tools.py                        Pure tool implementations (HTTP loopback + read-only DB)
mcp_overlord.py                 MCP stdio server registering tools for Claude Code
sql/0001_overlord_agent_ro.sql  Read-only PG role for the SQL tool
overlord-agent.service          systemd unit
install.sh                      One-shot installer (venv + pip install + systemd)
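As a rough picture of what a wrapper around asyncio.create_subprocess_exec can look like, here is a minimal sketch. It is not the contents of claude_wrapper.py. The function names build_argv and run_with_timeout are hypothetical, the flags are the ones used in the smoke test further down, and sending the prompt on stdin mirrors that smoke test rather than anything confirmed about the real wrapper.

```python
import asyncio
import sys
from typing import List, Optional


def build_argv(session_id: str, claude_bin: str = "claude") -> List[str]:
    """Build the command line; flags copied from this README's smoke test."""
    return [claude_bin, "-p", "--session-id", session_id, "--output-format", "json"]


async def run_with_timeout(
    argv: List[str],
    stdin_text: str,
    cwd: Optional[str] = None,
    timeout_s: float = 120.0,
) -> str:
    """Run a child process, feed it stdin, and return stdout or raise."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        cwd=cwd,
    )
    try:
        out, err = await asyncio.wait_for(
            proc.communicate(stdin_text.encode()), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # Do not leave a runaway child behind when the deadline passes.
        proc.kill()
        await proc.wait()
        raise
    if proc.returncode != 0:
        raise RuntimeError(f"exit {proc.returncode}: {err.decode(errors='replace')}")
    return out.decode()


if __name__ == "__main__":
    # Stand-in child process so the sketch runs without the claude binary.
    demo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(asyncio.run(run_with_timeout(demo, "hello")))
```

Killing the child on timeout matters here because every /agent/ask request spawns its own process.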

Required env vars (in repo-root .env)

SECRET_KEY=<same value the tracker uses to sign cookies>
AGENT_DB_DSN=postgresql://overlord_agent_ro:<password>@127.0.0.1:5432/dereth
TRACKER_URL=http://127.0.0.1:8765           # optional, this is the default
CLAUDE_BIN=/home/erik/.local/bin/claude     # optional, this is the default
CLAUDE_CWD=/home/erik/MosswartOverlord      # optional, this is the default
CLAUDE_TIMEOUT_S=120                        # optional
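One way to read these variables with the documented defaults is sketched below. The Settings class and load_settings function are hypothetical names, not taken from service.py. The sketch assumes the .env values have already been exported into the process environment (for example by the systemd unit), and it treats 120 as the timeout default, which the list above does not state explicitly.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    secret_key: str
    agent_db_dsn: str
    tracker_url: str
    claude_bin: str
    claude_cwd: str
    claude_timeout_s: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, failing fast on missing required keys."""
    missing = [k for k in ("SECRET_KEY", "AGENT_DB_DSN") if not env.get(k)]
    if missing:
        raise RuntimeError("missing required env vars: " + ", ".join(missing))
    return Settings(
        secret_key=env["SECRET_KEY"],
        agent_db_dsn=env["AGENT_DB_DSN"],
        tracker_url=env.get("TRACKER_URL", "http://127.0.0.1:8765"),
        claude_bin=env.get("CLAUDE_BIN", "/home/erik/.local/bin/claude"),
        claude_cwd=env.get("CLAUDE_CWD", "/home/erik/MosswartOverlord"),
        # Assumed default; the README marks this variable only as optional.
        claude_timeout_s=float(env.get("CLAUDE_TIMEOUT_S", "120")),
    )
```

Failing at startup on a missing SECRET_KEY is preferable to discovering it on the first request, when every cookie would be rejected.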

First-time setup on the server

  1. Create the read-only DB role (one-time):
    docker exec -i dereth-db psql -U postgres -d dereth \
        < /home/erik/MosswartOverlord/agent/sql/0001_overlord_agent_ro.sql
    docker exec -it dereth-db psql -U postgres -d dereth \
        -c "ALTER ROLE overlord_agent_ro PASSWORD '<random-password>';"
    
  2. Add AGENT_DB_DSN to /home/erik/MosswartOverlord/.env with the password you just set.
  3. Run the installer:
    cd /home/erik/MosswartOverlord
    bash agent/install.sh
    
  4. Update nginx: add the /api/agent/ location to /etc/nginx/sites-enabled/overlord. The block is already in nginx/overlord.conf in the repo, so sudo cp it into place and reload nginx.

Day-to-day deploy

After editing any agent file:

# On dev:
git push

# On server:
ssh erik@overlord.snakedesert.se
cd /home/erik/MosswartOverlord
git pull
sudo systemctl restart overlord-agent
journalctl -u overlord-agent -f   # tail logs

For Python dependency changes:

agent/.venv/bin/pip install -r agent/requirements.txt
sudo systemctl restart overlord-agent

Smoke tests

# 1. Service alive?
curl http://127.0.0.1:8767/agent/health

# 2. Cookie required?
curl -X POST http://127.0.0.1:8767/agent/ask \
    -H 'Content-Type: application/json' \
    -d '{"session_id":"x","message":"hi"}'
#  ⇒ 401

# 3. Direct claude invocation works?
echo "hello" | /home/erik/.local/bin/claude -p \
    --session-id 11111111-1111-1111-1111-111111111111 \
    --output-format json

# 4. End-to-end via nginx (with cookie):
curl -X POST https://overlord.snakedesert.se/api/agent/ask \
    -b 'session=<your-session-cookie>' \
    -H 'Content-Type: application/json' \
    -d '{"session_id":"<uuid>","message":"How many characters are online?"}'

Cost / rate-limit notes

  • Each /agent/ask shells out to claude -p once.
  • We use the user's Claude subscription (no API key) — flat-rate, no per-call billing, but subscription-tier rate limits still apply.
  • Reactive only: there are no background loops or periodic ticks. Each user message = one Claude turn (which may chain several tool calls internally before producing a final answer).
  • The SQL tool is hard-capped at 10s and 200 rows.
  • suitbuilder_search is the only tool that can take minutes; nginx read timeout is 180s for /api/agent/.
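The 10s / 200-row cap on the SQL tool could be enforced along these lines. This is a sketch under assumptions, not the code in tools.py: cap_query and run_capped are hypothetical names, and fetch stands in for whatever coroutine actually talks to the database (an asyncpg connection's fetch method would fit this shape). Wrapping the statement in a subquery only makes sense for SELECTs, and the read-only role remains the real safety net.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

MAX_ROWS = 200
TIMEOUT_S = 10.0


def cap_query(sql: str, max_rows: int = MAX_ROWS) -> str:
    """Wrap a SELECT so the row cap applies whatever LIMIT the inner query has."""
    inner = sql.strip().rstrip(";")
    return f"SELECT * FROM ({inner}) AS capped LIMIT {int(max_rows)}"


async def run_capped(
    fetch: Callable[[str], Awaitable[List[Any]]],
    sql: str,
    timeout_s: float = TIMEOUT_S,
    max_rows: int = MAX_ROWS,
) -> List[Any]:
    """Run the capped query, cancelling it client-side after timeout_s."""
    rows = await asyncio.wait_for(fetch(cap_query(sql, max_rows)), timeout=timeout_s)
    # Belt and braces: truncate again in case the backend ignored the LIMIT.
    return rows[:max_rows]
```

asyncio.wait_for only cancels the client-side await; a server-side statement timeout would be a reasonable complement, though this README does not say whether one is configured.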

Adding a new MCP tool

  1. Implement async def my_tool(...) -> dict in tools.py.
  2. Register it in mcp_overlord.py under TOOL_DEFS:
    • description (the agent reads this to decide when to call)
    • JSON schema for arguments
    • lambda dispatching to T.my_tool(...)
  3. Run sudo systemctl restart overlord-agent. Claude Code re-discovers the tool list on each invocation.
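The three pieces of a registration (description, argument schema, dispatching lambda) might fit together roughly as below. The actual layout of TOOL_DEFS in mcp_overlord.py is not shown in this README, so the dict shape, the key names and the dispatch helper are all assumptions. my_tool is defined inline here only so the sketch runs on its own; in the repo it would live in tools.py and be reached as T.my_tool.

```python
import asyncio
from typing import Any, Dict


async def my_tool(character: str) -> dict:
    """Hypothetical tool body; real implementations belong in tools.py."""
    return {"character": character, "online": True}


# Assumed registry shape: tool name -> description, JSON schema, handler.
TOOL_DEFS: Dict[str, Dict[str, Any]] = {
    "my_tool": {
        # The agent reads this text to decide when to call the tool.
        "description": "Report whether a character is online.",
        "input_schema": {
            "type": "object",
            "properties": {"character": {"type": "string"}},
            "required": ["character"],
        },
        # The lambda returns a coroutine, which dispatch() awaits.
        "handler": lambda args: my_tool(args["character"]),
    },
}


async def dispatch(name: str, args: dict) -> dict:
    """Look up a tool by name and await its handler."""
    tool = TOOL_DEFS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    return await tool["handler"](args)


if __name__ == "__main__":
    print(asyncio.run(dispatch("my_tool", {"character": "Example"})))
```

Keeping the handler a thin lambda leaves tools.py free of any MCP-specific code, which matches the "pure tool implementations" description in the Files table.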