acdream/docs/superpowers/specs/2026-04-11-foundation-phase-design.md
Erik 2c1c784c8c docs: refresh strategic roadmap + Foundation phase design spec
Output of a brainstorming session after Phase 6/7.1/9.1/9.2 shipped
and the lifestone crystal bug was isolated. Two documents:

1. docs/plans/2026-04-11-roadmap.md — strategic roadmap replacing
   the stale post-Phase-5 version. Reflects what's actually shipped,
   reorganizes upcoming work into Phases A (Foundation), B (Gameplay),
   C (Polish — includes VFX/particles, dynamic lights, palette tuning,
   double-sided translucents), D (UI + Sound), and E (long-tail).
   Updates the "when will my complaint be fixed" quick-lookup with
   the correct phase for portals (VFX, not shader tricks as previously
   claimed), smoke, fireplace fire, and everything we fixed this
   session. Phase ordering: A → B → (C/D in parallel) → E.

2. docs/superpowers/specs/2026-04-11-foundation-phase-design.md —
   detailed implementation spec for Phase A only. Covers the four
   sub-pieces (streaming landblock loader, frustum culling, net I/O
   thread, async dat decoding folded into the streaming worker),
   their components, data flow, error handling, testing strategy,
   and commit-point ordering. Includes non-goals to prevent scope
   creep.

No code changes yet. The spec goes to user review next, then into
the writing-plans skill for a detailed implementation plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:43:33 +02:00


# Phase A — Foundation phase design
**Status:** Spec, 2026-04-11, brainstormed from zero.
**Scope:** Replace acdream's current one-shot 3×3 landblock preload with a streaming loader, add frustum culling, move UDP I/O off the render thread, and make dat reads happen on a background worker. These four sub-pieces ship together as Phase A.
**Parent:** `docs/plans/2026-04-11-roadmap.md` (strategic roadmap)
## Goals
1. Walk across 10+ landblocks in any direction with no crashes and no empty voids.
2. Landblock-boundary crossings produce no visible hitch — background loading hides itself behind a queue the render thread drains.
3. Frame time stays usable on a 5×5 default visible window.
4. Visible window is runtime-configurable via `ACDREAM_STREAM_RADIUS` so the user can tune against their hardware without rebuilding.
5. No packet drops when the render thread stalls — UDP receive runs on its own thread.
## Non-goals
- Per-entity frustum culling. Per-landblock coarse culling is enough for Phase A; per-entity is Phase C or later.
- LOD mesh levels. Distant trees still use full vertex counts — that's a Phase C polish concern.
- Dungeon landblock format (`0xAAAA0000` family). Phase A streams only the surface `0xAAAA****` landblocks we already handle; dungeons are Phase E.
- Memory reclamation across _all_ uploaded dat assets. We reference-count GfxObj uploads per landblock; palettes, animations, and other shared dat assets are not GC'd. They grow monotonically until application exit — acceptable because individual assets are small and the working set is bounded by the visible window.
- Live perf benchmarks / numeric acceptance. Acceptance is "no visible hitch" judged by eye, not "frame time < 5ms."
## High-level architecture
Four new components, two existing renderers modified (`StaticMeshRenderer`, and optionally `TerrainRenderer`), and the existing net session modified. No changes to `WorldEntity`, `MeshRef`, shaders, or the mesh builders.
```
┌──────────────┐                                              ┌──────────────┐
│  Net thread  │ ── msg queue ──────────────────────────────> │              │
├──────────────┤                                              │              │
│ UDP.Receive  │                                              │              │
│ Decode+Frag  │                                              │    Render    │
│ Parse msg    │                                              │    thread    │
└──────────────┘                                              │              │
                                                              │  OnUpdate    │
┌──────────────┐                                              │  drain msgs  │
│ Load thread  │                                              │  compute     │
├──────────────┤ ── completion outbox ──────────────────────> │   center     │
│ pull job     │                                              │  diff region │
│ read dats    │                                              │  enqueue     │
│ build meshes │                                              │  drain cmpl. │
│ (CPU-only)   │ <── job queue ───────────────────────────────│  GPU upload  │
└──────────────┘                                              │              │
                                                              │  OnRender    │
                                                              │  cull        │
                                                              │  draw opaque │
                                                              │  draw trans. │
                                                              └──────────────┘
```
## Components
### 1. `StreamingRegion` *(new, `src/AcDream.App/Streaming/StreamingRegion.cs`)*
A pure data holder. Given a center landblock `(x, y)` and a radius `r`, it produces the set of landblock IDs in the `(2r+1) × (2r+1)` window.
**API:**
```csharp
public sealed class StreamingRegion
{
    public int CenterX { get; }
    public int CenterY { get; }
    public int Radius { get; }
    public IReadOnlySet<uint> Visible { get; } // landblock IDs: (lbX << 24) | (lbY << 16) | 0xFFFE

    public StreamingRegion(int cx, int cy, int radius);

    // Diff-style recenter: returns the landblocks to load (new to visible set)
    // and to unload (fell outside `radius + 1` hysteresis).
    public RegionDiff RecenterTo(int newCx, int newCy);
}

public readonly record struct RegionDiff(
    IReadOnlyList<uint> ToLoad,
    IReadOnlyList<uint> ToUnload);
```
**Hysteresis:** unload only happens when a landblock falls further than `radius + 1` from the current center. Prevents load/unload churn at boundary crossings. Unit tests verify the hysteresis logic with a standing-still center (no loads, no unloads) and a one-step cross (one new load, no unloads because the departing row is still inside `r+1`).
No threading. Pure data. Testable in isolation.
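The recenter-with-hysteresis rule can be sketched as follows. This is a minimal illustration rather than the planned implementation: the class name `StreamingRegionSketch`, the separate `_resident` set, and the use of Chebyshev distance (max of |dx|, |dy|) for "further than `radius + 1`" are assumptions; only the ID packing and the `[0, 255]` coordinate range come from this spec.

```csharp
using System;
using System.Collections.Generic;

public readonly record struct RegionDiff(
    IReadOnlyList<uint> ToLoad,
    IReadOnlyList<uint> ToUnload);

public sealed class StreamingRegionSketch
{
    // Everything currently loaded, including the hysteresis ring (assumed bookkeeping).
    private readonly HashSet<uint> _resident = new();
    private readonly HashSet<uint> _visible = new();

    public int CenterX { get; private set; }
    public int CenterY { get; private set; }
    public int Radius { get; }
    public IReadOnlySet<uint> Visible => _visible;

    public StreamingRegionSketch(int cx, int cy, int radius)
    {
        Radius = radius;
        RecenterTo(cx, cy); // the initial diff is discarded here for brevity
    }

    // ID packing as given in the API comment above.
    public static uint MakeId(int lbX, int lbY) =>
        ((uint)lbX << 24) | ((uint)lbY << 16) | 0xFFFEu;

    public RegionDiff RecenterTo(int newCx, int newCy)
    {
        CenterX = newCx;
        CenterY = newCy;

        // Rebuild the visible window; anything not yet resident must be loaded.
        _visible.Clear();
        var toLoad = new List<uint>();
        for (int x = newCx - Radius; x <= newCx + Radius; x++)
        for (int y = newCy - Radius; y <= newCy + Radius; y++)
        {
            if (x < 0 || x > 255 || y < 0 || y > 255) continue; // off-map: skip
            uint id = MakeId(x, y);
            _visible.Add(id);
            if (_resident.Add(id)) toLoad.Add(id);
        }

        // Unload only what lies strictly beyond radius + 1 (assumed Chebyshev distance).
        var toUnload = new List<uint>();
        foreach (uint id in _resident)
        {
            int x = (int)(id >> 24);
            int y = (int)((id >> 16) & 0xFF);
            int dist = Math.Max(Math.Abs(x - newCx), Math.Abs(y - newCy));
            if (dist > Radius + 1) toUnload.Add(id);
        }
        foreach (uint id in toUnload) _resident.Remove(id);

        return new RegionDiff(toLoad, toUnload);
    }
}
```

With center `(50, 50)` and radius `2`, a one-step `RecenterTo(51, 50)` in this sketch loads the five landblocks of column 53 and unloads nothing, because column 48 sits exactly at distance `r + 1`.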
### 2. `LandblockStreamer` *(new, `src/AcDream.App/Streaming/LandblockStreamer.cs`)*
Owns a dedicated background thread + two channel-based queues (inbox for jobs, outbox for completions).
**API:**
```csharp
public sealed class LandblockStreamer : IDisposable
{
    public LandblockStreamer(
        DatCollection dats,
        int centerX, int centerY, // for world-space offset computation inside completions
        ILogger? log = null);

    public void Start();
    public void EnqueueLoad(uint landblockId);
    public void EnqueueUnload(uint landblockId);

    /// <summary>
    /// Drains up to <paramref name="maxBatchSize"/> completed loads and returns
    /// them to the caller. Non-blocking. Call from the render thread once per
    /// OnUpdate. The caller is responsible for GPU upload and world-state
    /// integration using the returned <see cref="LoadedLandblock"/> records.
    /// </summary>
    public IReadOnlyList<LoadedLandblock> DrainCompletions(int maxBatchSize = 4);

    public void Dispose();
}

public sealed record LoadedLandblock(
    uint LandblockId,
    WorldView.LoadedLandblock Terrain,
    IReadOnlyList<WorldEntity> Entities); // scenery + interior + stabs pre-flattened
```
The load thread pulls jobs from the inbox, invokes the existing `WorldView.LoadLandblock` terrain path (unchanged), the scenery generator (unchanged), the EnvCell walker (unchanged), and `CellMesh.Build` / `GfxObjMesh.Build` (unchanged, CPU-only) to produce a `LoadedLandblock` record. The record is posted to the outbox.
**Unloads** post an "unload" completion that tells the render thread which landblock's data to release. The render thread holds the authoritative `Dictionary<uint, LoadedLandblock>` and references the GPU buffers. On unload, the render thread removes entries and decrements reference counts on GfxObj GPU bundles.
**Thread safety:** jobs and completions go through `System.Threading.Channels.Channel<T>` (unbounded, single-reader single-writer). `DatCollection` reads are thread-safe per DatReaderWriter docs; no extra locking. Cancellation via `CancellationTokenSource`; `Dispose` cancels and joins the thread.
**Error handling:** if a dat read throws, the worker catches, logs the landblock ID + exception, and posts a `LoadFailed` completion. The controller marks the landblock as "failed" and does not retry until the region recenters past it and back.
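The worker loop, the two channels, and the catch-and-post error path described above might look roughly like this. It is a sketch under assumptions: `StreamerSketch`, the `Completion` / `Loaded` record shapes, and the `Func<uint, object>` loader (standing in for the dat read plus mesh build) are hypothetical, and unload jobs are omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;

// Hypothetical completion shapes; the spec only names LoadedLandblock and LoadFailed.
public abstract record Completion(uint LandblockId);
public sealed record Loaded(uint LandblockId, object Payload) : Completion(LandblockId);
public sealed record LoadFailed(uint LandblockId, Exception Error) : Completion(LandblockId);

public sealed class StreamerSketch : IDisposable
{
    private static readonly UnboundedChannelOptions Spsc =
        new() { SingleReader = true, SingleWriter = true };

    private readonly Channel<uint> _jobs = Channel.CreateUnbounded<uint>(Spsc);
    private readonly Channel<Completion> _completions = Channel.CreateUnbounded<Completion>(Spsc);
    private readonly CancellationTokenSource _cancel = new();
    private readonly Func<uint, object> _load; // stand-in for dat read + CPU mesh build
    private readonly Thread _worker;

    public StreamerSketch(Func<uint, object> load)
    {
        _load = load;
        _worker = new Thread(WorkerLoop) { IsBackground = true, Name = "acdream.load" };
    }

    public void Start() => _worker.Start();

    public void EnqueueLoad(uint landblockId) => _jobs.Writer.TryWrite(landblockId);

    // Non-blocking: returns at most maxBatchSize completions per call.
    public IReadOnlyList<Completion> DrainCompletions(int maxBatchSize = 4)
    {
        var batch = new List<Completion>(maxBatchSize);
        while (batch.Count < maxBatchSize && _completions.Reader.TryRead(out var c))
            batch.Add(c);
        return batch;
    }

    private void WorkerLoop()
    {
        try
        {
            while (true)
            {
                // Block until a job arrives, the channel completes, or cancellation fires.
                bool more = _jobs.Reader.WaitToReadAsync(_cancel.Token)
                    .AsTask().GetAwaiter().GetResult();
                if (!more) return;

                while (!_cancel.IsCancellationRequested && _jobs.Reader.TryRead(out uint id))
                {
                    // A failing load becomes a LoadFailed completion; the thread survives.
                    try { _completions.Writer.TryWrite(new Loaded(id, _load(id))); }
                    catch (Exception ex) { _completions.Writer.TryWrite(new LoadFailed(id, ex)); }
                }
            }
        }
        catch (OperationCanceledException) { /* normal shutdown */ }
    }

    public void Dispose()
    {
        _cancel.Cancel();
        _jobs.Writer.TryComplete();
        if (_worker.IsAlive) _worker.Join();
        _cancel.Dispose();
    }
}
```

Because one worker drains a FIFO channel, completions in this sketch arrive in enqueue order; a real implementation with more than one worker would lose that property.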
### 3. `StreamingController` *(new, `src/AcDream.App/Streaming/StreamingController.cs`)*
Glue between `GameWindow`, `StreamingRegion`, and `LandblockStreamer`. Called once per frame from `OnUpdate`.
**API:**
```csharp
public sealed class StreamingController
{
    public StreamingController(
        LandblockStreamer streamer,
        int initialRadius,
        int initialCenterX, int initialCenterY);

    public int Radius { get; set; } // ACDREAM_STREAM_RADIUS env var seeds this

    /// <summary>
    /// Called every frame. Updates the streaming region based on the current
    /// observer position, enqueues loads/unloads as needed, and drains
    /// completed loads into <paramref name="worldState"/>.
    /// </summary>
    public void Tick(
        Vector3 observerWorldPosition,
        GpuWorldState worldState);
}
```
`GpuWorldState` is a helper class (mutable) that owns the per-landblock GPU resources: the terrain renderer's uploaded landblocks, the static-mesh renderer's uploaded entities per landblock, and the per-landblock AABBs used by the frustum culler. The controller adds/removes entries in it as completions drain. It replaces the current flat `_entities` list `GameWindow` holds.
The observer position is converted to landblock coordinates by dividing world X/Y by the landblock size and taking the floor, then applying the existing `_liveCenterX` / `_liveCenterY` offset math (reuse it rather than reimplementing it). The center is:
- **live mode:** server-sent player position (latest `EntityPositionUpdate` for our own GUID), converted via the same lb-offset math as in `OnLiveEntitySpawned` / `OnLivePositionUpdated`
- **offline mode:** camera position
Selection between the two: a simple `_liveSession is { CurrentState: InWorld }` check.
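The world-position-to-landblock conversion could be sketched like this. The helper name `LandblockMath`, its parameters, and the assumption that world `(0, 0)` sits at the corner of an "origin" landblock are all hypothetical; the 192-unit footprint is the figure quoted in the `FrustumCuller` section, and the real code should reuse the existing `_liveCenterX` / `_liveCenterY` math.

```csharp
using System;
using System.Numerics;

public static class LandblockMath
{
    public const float LandblockSize = 192f; // landblock footprint in world units

    // originLbX / originLbY: the landblock whose corner is assumed to sit at world (0, 0).
    public static (int X, int Y) ToLandblock(Vector3 worldPos, int originLbX, int originLbY)
    {
        // Floor rather than truncate, so small negative offsets map to the previous landblock.
        int dx = (int)MathF.Floor(worldPos.X / LandblockSize);
        int dy = (int)MathF.Floor(worldPos.Y / LandblockSize);
        return (originLbX + dx, originLbY + dy);
    }
}
```

Flooring matters at the boundary: a plain `(int)` cast would round `-1 / 192` toward zero and keep the observer in the origin landblock one step too long.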
**Hotkey / env var:** `ACDREAM_STREAM_RADIUS` is read at startup. Changing it mid-run is not required for the MVP; we can add a keybind later if it comes up.
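Reading the radius at startup might look like the sketch below. `StreamRadiusConfig` and the `MaxRadius` cap are hypothetical and not part of this spec; the default of `2` follows the 5×5 default window, and the window side is `2r + 1`.

```csharp
using System;

public static class StreamRadiusConfig
{
    public const int DefaultRadius = 2; // 5×5 default window
    public const int MaxRadius = 16;    // hypothetical sanity cap, not from the spec

    // Unset, non-numeric, or negative values fall back to the default.
    public static int Parse(string? raw) =>
        int.TryParse(raw, out int r) && r >= 0 ? Math.Min(r, MaxRadius) : DefaultRadius;

    public static int FromEnvironment() =>
        Parse(Environment.GetEnvironmentVariable("ACDREAM_STREAM_RADIUS"));

    // Side length of the visible window for a given radius.
    public static int WindowSide(int radius) => 2 * radius + 1;
}
```

Splitting `Parse` from `FromEnvironment` keeps the fallback logic unit-testable without touching the process environment.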
### 4. `FrustumCuller` *(new, `src/AcDream.App/Rendering/FrustumCuller.cs`)*
Per-frame view-frustum extraction from a view×projection matrix, plus AABB-vs-frustum intersection.
**API:**
```csharp
public readonly struct FrustumPlanes
{
    public readonly Vector4 Left, Right, Bottom, Top, Near, Far;

    public static FrustumPlanes FromViewProjection(Matrix4x4 vp);
}

public static class FrustumCuller
{
    /// <summary>
    /// Returns true if <paramref name="aabb"/> is potentially visible against
    /// <paramref name="planes"/>. Conservative — returns true for partial
    /// intersections. Zero allocations; suitable for per-frame use.
    /// </summary>
    public static bool IsAabbVisible(FrustumPlanes planes, BoundingBox aabb);
}
```
Used by `StaticMeshRenderer.Draw` and (optionally) `TerrainRenderer.Draw`:
1. At the start of each draw call, extract `FrustumPlanes` from `camera.View * camera.Projection`.
2. For each landblock's AABB (precomputed at load time: the terrain height range gives the Z extent, the 192×192 landblock footprint gives the X/Y extent), test against the frustum.
3. Skip entity iteration for any landblock whose AABB is fully outside. Opaque and translucent passes both benefit.
AABBs are owned by `GpuWorldState`. Each one is computed once, when its landblock loads, and never changes afterwards.
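A minimal sketch of the plane extraction and the conservative AABB test follows. It assumes `System.Numerics` conventions (row vectors, so `clip = p * vp`, and a `0..w` clip-space depth range as produced by `Matrix4x4.CreatePerspectiveFieldOfView`); if the renderer's projection uses a `-w..w` depth range, the near plane would be `cw + cz` instead. The `BoundingBox` record and the `LandblockAabb` helper are hypothetical stand-ins.

```csharp
using System;
using System.Numerics;

// Hypothetical stand-in for the BoundingBox type named in the API above.
public readonly record struct BoundingBox(Vector3 Min, Vector3 Max);

public readonly struct FrustumPlanes
{
    // Each plane is (a, b, c, d); a point p is inside when a*p.X + b*p.Y + c*p.Z + d >= 0.
    public readonly Vector4 Left, Right, Bottom, Top, Near, Far;

    private FrustumPlanes(Vector4 l, Vector4 r, Vector4 b, Vector4 t, Vector4 n, Vector4 f)
    {
        Left = l; Right = r; Bottom = b; Top = t; Near = n; Far = f;
    }

    public static FrustumPlanes FromViewProjection(Matrix4x4 vp)
    {
        // With row vectors, column j of vp produces clip-space component j.
        var cx = new Vector4(vp.M11, vp.M21, vp.M31, vp.M41);
        var cy = new Vector4(vp.M12, vp.M22, vp.M32, vp.M42);
        var cz = new Vector4(vp.M13, vp.M23, vp.M33, vp.M43);
        var cw = new Vector4(vp.M14, vp.M24, vp.M34, vp.M44);
        return new FrustumPlanes(
            cw + cx, cw - cx,  // -w <= x <= w
            cw + cy, cw - cy,  // -w <= y <= w
            cz, cw - cz);      //  0 <= z <= w (assumed depth range)
    }
}

public static class FrustumCuller
{
    public static bool IsAabbVisible(FrustumPlanes planes, BoundingBox aabb) =>
        !Outside(planes.Left, aabb) && !Outside(planes.Right, aabb) &&
        !Outside(planes.Bottom, aabb) && !Outside(planes.Top, aabb) &&
        !Outside(planes.Near, aabb) && !Outside(planes.Far, aabb);

    // True when even the box corner farthest along the plane normal is behind the plane.
    private static bool Outside(Vector4 p, BoundingBox b)
    {
        float x = p.X >= 0 ? b.Max.X : b.Min.X;
        float y = p.Y >= 0 ? b.Max.Y : b.Min.Y;
        float z = p.Z >= 0 ? b.Max.Z : b.Min.Z;
        return p.X * x + p.Y * y + p.Z * z + p.W < 0;
    }

    // Landblock AABB from its world-space corner and terrain height range.
    public static BoundingBox LandblockAabb(Vector2 originXY, float minZ, float maxZ) =>
        new(new Vector3(originXY, minZ),
            new Vector3(originXY.X + 192f, originXY.Y + 192f, maxZ));
}
```

The planes are left unnormalized because only the sign of the distance is used. The test is conservative: a box that straddles any plane, or sits near a frustum corner, is reported visible.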
### 5. `NetIoThread` *(modification to `src/AcDream.Core.Net/WorldSession.cs`)*
Move `WorldSession.PumpOnce` (or its eventual replacement) off the render thread.
**Before:**
```csharp
// called from GameWindow.OnUpdate
_liveSession.Tick(); // blocking receive with 250ms timeout, then ProcessDatagram
```
**After:**
```csharp
// started by Connect()
var netThread = new Thread(NetReceiveLoop) { IsBackground = true, Name = "acdream.net" };
netThread.Start();

private void NetReceiveLoop()
{
    while (!_cancel.IsCancellationRequested)
    {
        var bytes = _net.Receive(TimeSpan.FromMilliseconds(250), out _);
        if (bytes is null) continue;

        // Decode + fragment assembly happens here too — everything CPU-bound
        // runs off the render thread.
        var parsed = DecodeAndParse(bytes);
        foreach (var msg in parsed)
            _incoming.Writer.TryWrite(msg);
    }
}

// called from GameWindow.OnUpdate
public void Tick()
{
    while (_incoming.Reader.TryRead(out var msg))
        Dispatch(msg); // fires EntitySpawned / MotionUpdated / PositionUpdated
}
```
**Channel type:** `System.Threading.Channels.Channel<ParsedGameMessage>`, unbounded, single-producer/single-consumer (net thread writes, render thread reads). `TryWrite` is non-blocking.
**Event handlers stay on the render thread.** `EntitySpawned`, `MotionUpdated`, `PositionUpdated` are fired from `Tick()` (render thread), so existing handler code continues to see a single-threaded world and doesn't need locks. The `_entitiesByServerGuid` dict, `_animatedEntities`, GPU upload calls all still single-threaded.
**Shutdown:** `Dispose()` cancels `_cancel`, waits for the net thread via `netThread.Join(TimeSpan.FromSeconds(1))`, closes the UDP socket.
**ISAAC state:** `_inboundIsaac` and `_outboundIsaac` are owned by the net thread after handshake. `SendGameMessage` therefore needs either a lock (if it stays render-thread-owned) or an outbound channel so the net thread drives all socket writes. The simpler option is a single `lock(_isaacOutboundLock)` around outbound sends; the send volume is low and contention is negligible.
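The lock-based option can be sketched as below. The point it illustrates is that drawing the next keystream value and writing to the socket should share one critical section, so packets reach the wire in the same order their keys were drawn. `OutboundSenderSketch`, `nextKey` (standing in for the outbound ISAAC generator), and `socketSend` (standing in for the UDP write) are hypothetical; none of these signatures come from `WorldSession`.

```csharp
using System;

public sealed class OutboundSenderSketch
{
    private readonly object _isaacOutboundLock = new();
    private readonly Func<uint> _nextKey;               // stand-in for outbound ISAAC
    private readonly Action<uint, byte[]> _socketSend;  // stand-in for the UDP write

    public OutboundSenderSketch(Func<uint> nextKey, Action<uint, byte[]> socketSend)
    {
        _nextKey = nextKey;
        _socketSend = socketSend;
    }

    // Safe to call from any thread.
    public void SendGameMessage(byte[] payload)
    {
        lock (_isaacOutboundLock)
        {
            // Key draw and socket write stay together: if they were separate
            // critical sections, two threads could send in the opposite order
            // from the one in which they drew their keys.
            uint key = _nextKey();
            _socketSend(key, payload);
        }
    }
}
```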
### Existing files touched
- **`GameWindow.cs`**: replace the one-shot preload with `StreamingController.Tick` in `OnUpdate`. Remove the `_entities` field (migrates into `GpuWorldState`). Feed `FrustumPlanes` into `StaticMeshRenderer.Draw`.
- **`StaticMeshRenderer.cs`**: accept a `FrustumPlanes?` parameter on `Draw`; if set, skip entity iteration for landblocks whose AABB fails the test. Entity-to-landblock mapping is provided by `GpuWorldState`.
- **`TerrainRenderer.cs`**: same treatment: skip drawing landblocks that fail the frustum test.
- **`WorldSession.cs`**: split the receive loop onto a background thread; add the channel; adjust `Tick` to drain the channel.
- **`Program.cs` / `GameWindow.cs` startup**: read `ACDREAM_STREAM_RADIUS`, instantiate `LandblockStreamer` + `StreamingController`.
## Data flow (steady state, one frame)
1. **Net thread (continuous):** receives packet → decodes → fragment assembles → parses → writes `ParsedGameMessage` to `_incoming` channel.
2. **Load thread (continuous):** pulls job from `LandblockStreamer._jobs` → reads dats → builds meshes → posts `LoadedLandblock` to `LandblockStreamer._completions`.
3. **OnUpdate (render thread):**
   - `_liveSession.Tick()` drains `_incoming`, fires `EntitySpawned` / `MotionUpdated` / `PositionUpdated`. Handlers update `_entitiesByServerGuid`, `_animatedEntities`, world state.
   - `_streamingController.Tick(observerPos, worldState)`:
     - Computes observer landblock coordinates from world pos.
     - `StreamingRegion.RecenterTo(...)` → `RegionDiff`.
     - Enqueues `ToLoad` into `LandblockStreamer`.
     - Enqueues `ToUnload` into `LandblockStreamer`.
     - Drains `LandblockStreamer.DrainCompletions()` → GPU-uploads new terrain + meshes → adds entities to `worldState`.
   - Animation tick (`TickAnimations(dt)`) runs over the updated animated-entity set.
   - Fly camera input + keyboard.
4. **OnRender (render thread):**
   - Compute `FrustumPlanes` from camera.
   - Draw terrain, skipping culled landblocks.
   - Draw static mesh pass 1 (opaque), skipping culled landblocks.
   - Draw static mesh pass 2 (translucent, with cull face on), skipping culled landblocks.
## Error handling & edge cases
- **Dat load fails for a landblock.** Worker logs, posts a `LoadFailed` record. Controller marks the landblock as "failed" in a small dict. The landblock stays out of the visible set until the region recenters off it and back: a crude self-heal that avoids a tight retry loop.
- **Player teleports across the world** (live mode recalls, long-range recall spells). Region recomputes entirely on the next `Tick`; the old set unloads, the new set enqueues. A single multi-block hitch on the first frame after teleport is acceptable; it's the unavoidable cost of landing in an unloaded area.
- **Net thread dies** (socket exception, unhandled parse error). Catch, log, set `_liveSession` state to `Failed`. Render thread continues showing the last-known world. No auto-reconnect in Phase A.
- **Shutdown mid-load.** `Dispose` cancels the load thread's token, discards any in-flight completions (so nothing is GPU-uploaded after disposal), joins the thread, then disposes GPU resources.
- **Observer outside the currently-loaded region** (first frame, before Tick has a chance to run). Fallback: load the 3×3 around the observer synchronously on first frame, then let streaming take over. This is the "initial preload" behavior and matches current startup semantics.
- **`StreamingRegion` at map edges.** Restrict `(lbX, lbY)` to `[0, 255]` (AC landblock coordinates are 8-bit). Landblocks that would fall beyond the edge are skipped without error.
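The "failed until the region moves off it and back" rule from the first bullet could be tracked with something as small as the following. `FailedLandblockTracker` and its method names are hypothetical; the spec only says the controller keeps a small dict of failed landblocks.

```csharp
using System.Collections.Generic;

public sealed class FailedLandblockTracker
{
    private readonly HashSet<uint> _failed = new();

    // Called when a LoadFailed completion is drained.
    public void MarkFailed(uint landblockId) => _failed.Add(landblockId);

    // Checked before enqueueing a load; failed landblocks are not retried in place.
    public bool ShouldEnqueue(uint landblockId) => !_failed.Contains(landblockId);

    // Called with the unload side of a region diff: leaving the region clears
    // the mark, so the landblock gets one fresh attempt when it re-enters.
    public void OnLeftRegion(IEnumerable<uint> unloadedIds)
    {
        foreach (uint id in unloadedIds) _failed.Remove(id);
    }
}
```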
## Testing strategy
New test projects / files:
- **`tests/AcDream.Core.Tests/Streaming/StreamingRegionTests.cs`**
  - Construct with center `(50, 50)` radius `2` → `Visible` is exactly the 25-cell window.
  - `RecenterTo(50, 50)` → empty diff.
  - `RecenterTo(51, 50)` → `ToLoad` has the new column (5 landblocks), `ToUnload` is empty (hysteresis keeps the departing column).
  - `RecenterTo(53, 50)` (three-step jump from `(50, 50)`) → `ToLoad` has the three new columns (15 landblocks), `ToUnload` has the two departing columns at x = 48 and 49 (10 landblocks, now beyond `r+1`); column 50 sits exactly at `r+1` and stays.
  - `RecenterTo(100, 100)` (full teleport) → entire old set unloads, entire new set loads.
  - Edge clamping at `(0, 0)` and `(255, 255)`.
- **`tests/AcDream.Core.Tests/Streaming/FrustumCullerTests.cs`**
  - Identity VP → the frustum is the clip-space box itself: an AABB inside it is visible, an AABB far outside it is not.
  - Known perspective with known camera pos → AABB directly in front visible, AABB directly behind not visible.
  - AABB straddling a plane → visible (conservative).
  - Edge cases: zero-size AABB on the near plane, huge AABB enclosing the camera.
- **`tests/AcDream.App.Tests/Streaming/LandblockStreamerTests.cs`** (create project if missing, copy the small existing test-project pattern)
  - Fake `DatCollection` that returns canned landblocks on demand.
  - Enqueue three loads → drain three completions. Order-independent but arrival-order stable.
  - Enqueue a load that throws → `LoadFailed` arrives in the outbox; no thread crash.
  - Dispose mid-load → thread joins cleanly, subsequent operations are no-ops.
- **`tests/AcDream.Core.Net.Tests/NetIoThreadTests.cs`**
  - Loopback UDP harness (already exists for `LiveHandshakeTests`). Spin up the session with net-thread mode on.
  - Server sends N packets → client's `Tick()` dispatches N events on the render thread (i.e., on the test thread, which we assert).
  - Dispose while a receive is in flight → clean shutdown, no thread leaks.
  - Existing `LiveHandshakeTests` continue to pass unchanged → proves the real handshake path still works.
Total expected new test count: ~25, bringing totals to roughly 120 Core and 100 Core.Net, plus the new `AcDream.App.Tests` project.
## Implementation order
Phase A is broken into four concrete increments, each independently commit-able and verifiable:
1. **A.1 Streaming (includes async dat read as its implementation, i.e., sub-piece A.4).**
   - Land `StreamingRegion`, `StreamingRegionTests`, `LandblockStreamer`, `StreamingController`, and the `GameWindow` wiring.
   - Verify: the window size follows `ACDREAM_STREAM_RADIUS`, walk 10 landblocks, no crashes.
   - Commit point.
2. **A.2 Frustum culling.**
   - Land `FrustumPlanes`, `FrustumCuller`, per-landblock AABB cache in `GpuWorldState`, wiring in `StaticMeshRenderer.Draw` and `TerrainRenderer.Draw`.
   - Verify: frame time with a `radius=3` window is at least as good as the pre-culling baseline, with unchanged visual output.
   - Commit point.
3. **A.3 Net I/O thread.**
   - Split `WorldSession.PumpOnce` onto a dedicated thread, add the `Channel<ParsedGameMessage>`, keep event dispatch on the render thread.
   - Verify: `LiveHandshakeTests` still pass, live session still works, no packet drops under artificial render-thread stalls.
   - Commit point.
4. **A.4 (folded).** Already landed in A.1. No separate commit; call this out in the A.1 commit message.
Each increment is sized to fit one focused session, so the whole phase should fit in 2-3 sessions total.
## Acceptance
Phase A is **done** when all of the following are true:
1. `ACDREAM_STREAM_RADIUS=4` produces a 9×9 window on startup; `=2` produces 5×5; default (unset) is 5×5.
2. Starting in Holtburg, flying (camera) or walking (player, later phase) 10 landblocks in any cardinal direction results in no crashes, no empty void tiles, and no missing entities.
3. At a typical landblock-boundary crossing, there is no visible frame hitch; loads are invisible behind the background worker.
4. Frame time with radius `2` (5×5 default) stays usable (judged by eye, no numeric threshold).
5. `LiveHandshakeTests` pass, proving the net-thread split doesn't regress handshake behavior.
6. All new unit tests pass.
7. Total test count is at least 220 (up from 194).
## Open questions for implementation
None blocking. Items noted for the implementation plan:
- **Observer position in live+offline:** exact fallback ordering when live is starting up but not yet InWorld. Minor: fall back to camera until the first position update.
- **GPU memory reclamation rhythm:** do we unload immediately on drain, or hold a small LRU? Start with immediate; upgrade to LRU if thrashing becomes visible.
- **Channel bound size:** pick unbounded for simplicity; revisit if completions pile up.
These get resolved during implementation; none changes the spec.