The agent service was running as User=erik, which meant:
- Sessions polluted erik's ~/.claude/projects/
- erik's .claude/settings.local.json (months of accumulated dev permissions
for docker/git/dotnet/etc.) was loaded by the production agent, defeating
the --allowed-tools whitelist
- Subscription rate quota mingled between human-erik's interactive Claude
Code use and the production assistant
- Theoretical access to /home/erik/.ssh, .bash_history, .gitconfig
Now:
- User=overlord-agent (system account, no shell, /var/lib/overlord-agent home)
- HOME=/var/lib/overlord-agent — claude state fully isolated from erik
- /home/erik/.claude permissions tightened to 0700 (was 0755)
- group=overlord-agent on the repo + /etc/overlord/agent.env (read-only)
Project settings:
- New strict committed .claude/settings.json: deny Bash/Read/Write/Edit/
Glob/Grep/NotebookEdit/WebSearch; allow only WebFetch(domain:acpedia.org)
- .claude/settings.local.json now gitignored (was leaking dev permissions
to the server through the deploy)
The extra ~@cpu-emulation ~@obsolete ~@swap ~@raw-io negations on top of
@system-service killed Claude Code (Node) with SIGSYS during startup.
Keep just the truly dangerous groups blocked: ~@privileged ~@reboot
~@mount. The base @system-service preset already excludes others (no
@debug, no @resources, etc. are included by default in that preset).
Claude Code is a Node app. V8 JIT requires W^X transitions via mprotect
with PROT_EXEC on JIT'd code pages. MemoryDenyWriteExecute kills the
process with SIGTRAP/abort during startup (~10ms in).
Without JIT we'd have to use --jitless mode, which destroys performance.
The other systemd hardening (ProtectSystem, ProtectHome,
InaccessiblePaths, NoNewPrivileges, capability drop, syscall filter,
PrivateTmp, etc.) still gives strong filesystem and privilege isolation.
The remaining shellcode-injection risk is theoretical — there is no
Bash/Write/Edit tool exposed for an attacker to chain into.
Also: MemoryLimit -> MemoryMax (deprecated unit form).
systemd unit now applies defense-in-depth:
- ProtectSystem=strict + ProtectHome=read-only (rest of FS sealed)
- ReadWritePaths only for ~/.claude (session JSONLs) and venv + audit log
- InaccessiblePaths blocks /etc/shadow, /etc/ssh, /root, ~/.ssh, shell history
- NoNewPrivileges + dropped capabilities (no setuid escalation, no caps)
- PrivateTmp, PrivateDevices, ProtectKernel*, MemoryDenyWriteExecute
- SystemCallFilter @system-service ~@privileged ~@debug ~@mount etc.
- RestrictAddressFamilies blocks raw/packet sockets
Application layer:
- Per-user rate limit 60/hour (configurable via AGENT_RATE_MAX)
- Per-user concurrency cap of 1 in-flight (no parallel claude burns)
- JSONL audit log of every /agent/ask to /var/log/overlord-agent/audit.jsonl
Logs username, message preview, result preview, timing, errors.
Plus secrets migration: EnvironmentFile now prefers /etc/overlord/agent.env
(root:erik 0640) over /home/erik/MosswartOverlord/.env, so even the
read-only /home doesn't expose them. Falls back to old path during
transition.
Adds an in-dashboard AI assistant that answers questions about live game
state. Designed reactively (no background loops) — every user message in
the chat window or via /api/agent/ask runs one `claude -p` invocation.
Architecture:
- New host-side FastAPI service (agent/) on 127.0.0.1:8767, OUTSIDE the
dereth-tracker Docker container because `claude` and ~/.claude
credentials live on the host.
- nginx routes /api/agent/* to the host service.
- The same browser session cookie the tracker issues authenticates
agent requests (shared SECRET_KEY).
- The agent shells out to `claude -p --session-id <uuid>` with
cwd=/home/erik/MosswartOverlord. Sessions persist as JSONL on disk
via Claude Code's built-in machinery.
- An MCP stdio server (agent/mcp_overlord.py) exposes tools to Claude:
get_live_players, get_recent_rares, query_telemetry_db (read-only,
parsed by sqlglot to reject DML/DDL), get_player_state, get_inventory,
get_inventory_search, get_combat_stats, get_equipment_cantrips,
get_quest_status, get_server_health, suitbuilder_search.
- Read-only PG role (overlord_agent_ro) is the second line of defense
on the SQL tool — even a parser bypass can't mutate.
Frontend:
- AgentWindow.tsx — draggable chat window with localStorage-pinned
session UUID, "New Chat" button, on-mount rehydration from
/agent/sessions/{id}/history (parses Claude Code's JSONL).
- Wired into WindowRenderer + Sidebar (🤖 Assistant button).
Operational:
- systemd unit (overlord-agent.service) + install.sh.
- agent/README.md documents env vars, deploy flow, smoke tests.
- nginx/overlord.conf gets a new /api/agent/ location with 180s timeout.
- CLAUDE.md gets an "Overlord Assistant Mode" section briefing the
agent on which tools to use and how to behave.
NOT YET DEPLOYED — server still needs:
1. Apply agent/sql/0001_overlord_agent_ro.sql + ALTER ROLE password
2. Add AGENT_DB_DSN to /home/erik/MosswartOverlord/.env
3. bash agent/install.sh (creates venv, installs unit, starts service)
4. sudo cp /home/erik/MosswartOverlord/nginx/overlord.conf to nginx + reload
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>