acdream/docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md
Erik a28a5b7583 docs(A.5 T27): spec + plan amendments for T22.5 + ship
Spec (2026-05-09-phase-a5-two-tier-streaming-design.md):
- §2 acceptance metrics reshaped from absolute 240 FPS to
  refresh-rate-relative + per-preset (95th-pct ≤ 1000ms/refresh
  standstill; ≤ 1.5× walking) to match the Quality Preset reality.
- New §4.10 Quality Preset System (T22.5): enum Low/Medium/High/Ultra,
  QualitySettings schema, canonical preset values table, env-var
  override table, wiring notes (GameWindow.OnLoad + ReapplyQualityPreset),
  MSAA mid-session unsupported caveat, file list, test count (12).
- New §11 What was deferred: 8 items (Tier 1 cache, lifestone, JobKind
  plumbing, Tier 2/3, ToEntries alloc, InvalidateEntity wiring, High
  preset retest). Former §11 References renumbered to §12.

Plan (2026-05-09-phase-a5-two-tier-streaming.md):
- New Task 22.5 section inserted between T22 and T23: full inline spec
  with schema, preset table, env-var list, wiring steps, acceptance
  criteria, deferred items, commit SHAs. Includes file-name corrections
  (SettingsState → DisplaySettings, DisplayTab → SettingsPanel).
- Self-review cross-check table: new §4.10 row pointing at T22.5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 10:06:26 +02:00


Phase A.5 — Two-tier Streaming + Horizon LOD — Design

Created: 2026-05-09 (immediately after N.5b ship + brainstorm).
Status: Spec — awaiting user review before plan-writing.
Branch: claude/hopeful-darwin-ae8b87 (worktree under .claude/worktrees/hopeful-darwin-ae8b87).
Predecessor: Phase N.5b SHIP at 08b7362. A.5 handoff at f7f8867.


1. Goal

Scale acdream's visible reach from radius=5 (~1 km) to radius=12 (~2.3 km horizon) while sustaining 240 FPS at standstill on a 240 Hz / 1440p monitor.

Delivered through:

  1. Two-tier streaming (near = full detail, far = terrain only).
  2. Tightening the existing per-LB entity dispatcher walk.
  3. Off-thread mesh build (single worker).
  4. Fog blend at the near-tier boundary to mask the scenery cutoff.
  5. Three nearly-free visual quality wins (terrain mipmaps + anisotropic, A2C with MSAA on foliage, depth-write audit).

The headline win: walking around Holtburg, the user sees a real horizon (2.3 km of visible terrain) without the client falling off a perf cliff.

User goal verbatim (2026-05-09):

"I just want great smooth HIGH fps visuals. Should look great. As long as it scales and we get very high FPS"


2. Hardware target + acceptance metrics

Target hardware

  • AMD Radeon RX 9070 XT (RDNA 4, ~December 2025).
  • 240 Hz @ 2560×1440 (verified via Get-CimInstance Win32_VideoController).
  • Frame budget: 4.167 ms at vsync (1000 ms / 240 Hz).

Acceptance metrics (as shipped — revised with Quality Preset system)

  1. Build green; existing tests still green. N.5b conformance sentinel passes (visual mesh Z = TerrainSurface.SampleZ within 1 mm).
  2. Standstill at user's selected preset on user's hardware:
    • 95% of frames hit ≤ (1000ms / monitor refresh rate).
    • No absolute FPS number is required — the Quality Preset system (§4.10) is the user's knob for trading quality vs frame budget.
  3. Walking at user's selected preset:
    • 95% of frames hit ≤ 1.5× (1000ms / monitor refresh rate).
  4. First traversal into virgin region (cold mesh cache):
    • Render thread frame time stays within 2× the standstill budget while the worker fills the far-tier horizon (~2.7 s of "horizon filling in" is OK).
  5. Visual gate (user-driven, same on all presets): user launches the client, walks Holtburg → North Yanshi, and confirms:
    • Horizon visible at ~2.3 km.
    • Fog blend at N₁ smooths the scenery boundary (no harsh cliff).
    • Distant terrain does not shimmer (mipmaps work).
    • Tree edges are smooth (A2C works).
    • No new z-fighting / depth artifacts (depth-write audit).
  6. Per-subsystem regression budgets (added to [WB-DIAG] / [TERRAIN-DIAG] output):
    • Entity dispatcher cpu_us median ≤ 2.0 ms at standstill.
    • Terrain dispatcher cpu_us median ≤ 1.0 ms at standstill (all 625 LBs).
  7. N.5b sentinel intact: TerrainSlot, TerrainModernConformance, Wb*, MatrixComposition, TextureCacheBindless, SplitFormulaDivergence — all pass clean.
  8. SHIP record + perf baseline doc + memory entry mirroring N.5b's pattern.

A failure on (5) is a SHIP-blocker. A failure on (3) walking-FPS criterion escalates to "fix or document the tradeoff and ship N.6 next" — not a direct blocker but pushes the gate to user discretion.
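The refresh-rate-relative criteria in (2) and (3) can be checked offline against a captured list of frame times. The following is a minimal sketch, assuming a nearest-rank 95th percentile; `FrameBudget`, `Percentile95`, and `MeetsBudget` are hypothetical names for illustration and are not part of acdream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not from the acdream codebase.
public static class FrameBudget
{
    // Per-frame budget in ms for a given refresh rate (240 Hz -> ~4.167 ms).
    public static double BudgetMs(double refreshHz) => 1000.0 / refreshHz;

    // Nearest-rank 95th percentile of the captured frame times.
    public static double Percentile95(IReadOnlyList<double> frameTimesMs)
    {
        if (frameTimesMs.Count == 0)
            throw new ArgumentException("no samples", nameof(frameTimesMs));
        var sorted = frameTimesMs.OrderBy(t => t).ToArray();
        int rank = (95 * sorted.Length + 99) / 100; // integer ceil(0.95 * n), 1-based
        return sorted[rank - 1];
    }

    // multiplier = 1.0 for the standstill criterion, 1.5 for walking.
    public static bool MeetsBudget(IReadOnlyList<double> frameTimesMs, double refreshHz, double multiplier)
        => Percentile95(frameTimesMs) <= multiplier * BudgetMs(refreshHz);
}
```

With nearest-rank, "p95 ≤ budget" is equivalent to "at least 95% of frames are within budget", which is how criteria (2) and (3) are worded.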


3. Two-tier streaming model

Tier definitions

| Tier | Radius | LB count | Loads | GPU mem |
|---|---|---|---|---|
| Near | N₁ = 4 | 9×9 = 81 LBs | terrain mesh + LandBlockInfo (stabs/buildings) + scenery generation + EnvCells + collision data + entity registration with WB dispatcher | scenery instance buffers + per-entity textures (depends on PaletteOverrides) |
| Far | N₂ = 12 | 25×25 - 9×9 = 544 LBs | terrain mesh ONLY (LandBlock heightmap + atlas blend) | ~14 MB shared atlas slots |
| Total | | 25×25 = 625 LBs | combined | ~30 MB total estimated |
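The LB counts follow from a square window of side 2r + 1 centred on the player's landblock (the square shape is inferred from the 9×9 and 25×25 figures, not stated separately):

```latex
\begin{aligned}
\text{near}  &= (2N_1 + 1)^2 = (2\cdot 4 + 1)^2 = 81,\\
\text{total} &= (2N_2 + 1)^2 = (2\cdot 12 + 1)^2 = 625,\\
\text{far}   &= 625 - 81 = 544.
\end{aligned}
```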

Hysteresis (Q7 Option A — match existing radius+2 convention)

  • Near-tier: entity load at distance 4, demote (entity unload) at distance 6.
  • Far-tier: terrain load at distance 12, terrain unload at distance 14.

Both boundaries get the same 2-LB buffer. Phase A.1's existing hysteresis mechanism in StreamingRegion.RecenterTo is the reference pattern; A.5 extends it from one radius to two.

Tier transitions

| Transition | Trigger | Action |
|---|---|---|
| null → far | LB enters far window from outside | Worker reads LandBlock heightmap, builds mesh, posts LandblockStreamResult.Loaded { Tier = Far }. Render thread adds slot in TerrainModernRenderer. No entity work. |
| null → near | LB jumps null → near in one tick (first-tick bootstrap; teleport into virgin region) | Worker reads LandBlock heightmap + LandBlockInfo, generates scenery, builds entity list, builds mesh. Posts LandblockStreamResult.Loaded { Tier = Near }. Render thread adds terrain slot AND merges entities. |
| far → near | LB enters near window from far-resident | Worker reads LandBlockInfo, generates scenery, builds entity list. Posts LandblockStreamResult.Promoted. Render thread merges entities into GpuWorldState for the existing LB (terrain already loaded). |
| near → far | LB leaves near window past hysteresis (distance > 6) | Render thread drops the LB's entities from GpuWorldState (which fires _wbSpawnAdapter.OnLandblockUnloaded). Terrain stays. |
| far → null | LB leaves far window past hysteresis (distance > 14) | Render thread removes the terrain slot from TerrainModernRenderer. |

The order matters: when a player walks outward, the same LB goes near → far → null over time. Each transition is one event per LB per crossing.

Why the player crossing the N₁ boundary works

The player is always at radius=0 from the streaming center (the streaming center IS the player). The boundary effects are about LBs at the edge of N₁ crossing inward/outward as the player moves. Server-spawned NPCs are delivered by ACE's broadcast (radius typically 5-7 LBs ≥ N₁), so when an LB promotes back to near, ACE will already have its NPCs broadcast or re-broadcast as the player moves through. Dat-static entities (stabs, buildings) are reloaded from LandBlockInfo on promotion. Scenery is re-generated from the deterministic seed at the same time.


4. Component-by-component design

4.1 LandblockStreamTier — new enum

namespace AcDream.App.Streaming;

public enum LandblockStreamTier
{
    Far,   // terrain only
    Near,  // full detail (terrain + entities + scenery + EnvCells)
}

4.2 StreamingRegion — extended to two radii

public sealed class StreamingRegion
{
    public int CenterX { get; }
    public int CenterY { get; }
    public int NearRadius { get; }   // N₁ (default 4)
    public int FarRadius  { get; }   // N₂ (default 12)

    public IReadOnlyCollection<uint> NearVisible { get; }   // 9×9 window
    public IReadOnlyCollection<uint> FarVisible  { get; }   // 25×25 window minus near
    public IReadOnlyCollection<uint> Resident { get; }      // hysteresis-retained

    public TwoTierDiff RecenterTo(int newCx, int newCy);
}

public readonly record struct TwoTierDiff(
    IReadOnlyList<uint> ToLoadFar,    // entered far window from null (need terrain only)
    IReadOnlyList<uint> ToLoadNear,   // entered near window from null (need terrain + entities — first-tick bootstrap, teleport)
    IReadOnlyList<uint> ToPromote,    // entered near window from far-resident (need entities only — terrain already loaded)
    IReadOnlyList<uint> ToDemote,     // exited near window past hysteresis (drop entities)
    IReadOnlyList<uint> ToUnload);    // exited far window past hysteresis (drop terrain)

The hysteresis math:

  • Near-unload threshold: NearRadius + 2 = 6.
  • Far-unload threshold: FarRadius + 2 = 14.

A landblock is "near-resident" if its distance ≤ 6; "far-resident" if its distance is in (6, 14]. Beyond 14, it unloads entirely.
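The residency rule above can be sketched as a small pure function. This is an illustration only: it assumes LB distance is the Chebyshev distance in landblock units (which matches the square 9×9 and 25×25 windows), and `Residency` and `TierHysteresis` are hypothetical names rather than part of StreamingRegion.

```csharp
using System;

// Hypothetical names; a sketch of the hysteresis rule, not StreamingRegion itself.
public enum Residency { None, Far, Near }

public static class TierHysteresis
{
    // Assumption: distance is Chebyshev distance in landblock units,
    // which yields square windows around the center.
    public static int Distance(int cx, int cy, int lbX, int lbY)
        => Math.Max(Math.Abs(lbX - cx), Math.Abs(lbY - cy));

    // Next residency of one LB given its previous residency and its new distance.
    public static Residency Next(Residency prev, int distance, int nearRadius = 4, int farRadius = 12)
    {
        int nearUnload = nearRadius + 2; // 6
        int farUnload = farRadius + 2;   // 14

        if (distance <= nearRadius) return Residency.Near;                           // load near or promote
        if (prev == Residency.Near && distance <= nearUnload) return Residency.Near; // held by hysteresis
        if (distance <= farRadius) return Residency.Far;                             // load far or demote
        if (prev != Residency.None && distance <= farUnload) return Residency.Far;   // held by hysteresis
        return Residency.None;                                                       // unload
    }
}
```

Comparing `prev` with `Next(...)` per landblock gives the diff lists: None→Far is ToLoadFar, None→Near is ToLoadNear, Far→Near is ToPromote, Near→Far is ToDemote, and Far→None is ToUnload.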

4.3 StreamingController — routes by tier

public sealed class StreamingController
{
    public int NearRadius { get; set; } = 4;
    public int FarRadius  { get; set; } = 12;
    public int MaxCompletionsPerFrame { get; set; } = 4;

    // Action signatures change to carry the job kind (LandblockStreamJobKind, below).
    private readonly Action<uint, LandblockStreamJobKind> _enqueueLoad;
    private readonly Action<uint> _enqueueUnload;
    // ...

    public void Tick(int observerCx, int observerCy)
    {
        // First-tick bootstrap: every near-window LB → ToLoadNear; every
        // far-window-only LB → ToLoadFar.
        // Steady-state RecenterTo: produces 5 transition lists.
        //   - ToLoadFar    → _enqueueLoad(id, JobKind.LoadFar)
        //   - ToLoadNear   → _enqueueLoad(id, JobKind.LoadNear)
        //   - ToPromote    → _enqueueLoad(id, JobKind.PromoteToNear)
        //   - ToDemote     → _state.RemoveEntities(id) on render thread (no worker job)
        //   - ToUnload     → _enqueueUnload(id)
        // Drain completions and route by result variant.
    }
}

public enum LandblockStreamJobKind { LoadFar, LoadNear, PromoteToNear }

The render thread decides the job kind up-front based on its own knowledge of which LBs are currently terrain-resident; the worker never peeks at render-thread state. Three distinct worker paths:

  • LoadFar: read LandBlock heightmap only. Skip LandBlockInfo, skip LandblockLoader.BuildEntitiesFromInfo, skip SceneryGenerator/WbSceneryAdapter. Build LandblockMesh. Post LandblockStreamResult.Loaded(Tier=Far, Entities=[], MeshData=mesh).
  • LoadNear: read LandBlock + LandBlockInfo + scenery generation + build mesh. Post LandblockStreamResult.Loaded(Tier=Near, Entities=..., MeshData=mesh). Used for first-tick bootstrap of the inner ring and for the rare null→Near jump (teleport into virgin region).
  • PromoteToNear: read LandBlockInfo + scenery generation only. Skip LandBlock heightmap (mesh already on GPU). Skip LandblockMesh.Build. Post LandblockStreamResult.Promoted(id, entities).

4.4 LandblockStreamResult — new variants

public abstract record LandblockStreamResult
{
    public sealed record Loaded(
        uint LandblockId,
        LandblockStreamTier Tier,
        LandBlock Heightmap,
        IReadOnlyList<WorldEntity> Entities,    // empty for Far
        LandblockMeshData MeshData              // built off-thread
    ) : LandblockStreamResult;

    public sealed record Promoted(
        uint LandblockId,
        IReadOnlyList<WorldEntity> Entities    // entity layer for an already-loaded far-tier LB
    ) : LandblockStreamResult;

    // Existing:
    public sealed record Unloaded(uint LandblockId) : LandblockStreamResult;
    public sealed record Failed(uint LandblockId, string Error) : LandblockStreamResult;
    public sealed record WorkerCrashed(string Error) : LandblockStreamResult;
}

Loaded carries MeshData — the mesh is built on the worker thread, NOT in _applyTerrain on the render thread. Promoted only carries entities; the mesh is already in TerrainModernRenderer.

4.5 LandblockStreamer — single worker, mesh-build on-worker

Existing LandblockStreamer (today on a single background thread) gets extended to:

  1. Read dat as today (DatCollection.Get<LandBlock> etc.).
  2. Build LandblockMesh on the same thread:
    var meshData = LandblockMesh.Build(
        block, lbX, lbY, heightTable, _ctx, _surfaceCache);
    
  3. Post LandblockStreamResult.Loaded(... MeshData = meshData) to the completion queue.

Thread-safety implications:

  • _ctx (TerrainBlendingContext) is read-only after init — no change.
  • _surfaceCache: today a plain Dictionary<uint, SurfaceInfo>, populated lazily by LandblockMesh.Build. Currently safe because Build runs on the render thread; A.5 moves Build to the worker, so the cache must be thread-safe. Swap to ConcurrentDictionary<uint, SurfaceInfo> with GetOrAdd for the populate path. The factory inside GetOrAdd may run twice for the same key under contention (acceptable — the result is deterministic).
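A minimal sketch of the GetOrAdd populate path, assuming the resolver is deterministic and free of side effects. `SurfaceCache` and the two-field `SurfaceInfo` below are hypothetical stand-ins, not the real acdream types.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the real SurfaceInfo type.
public readonly record struct SurfaceInfo(uint SurfaceId, int AtlasLayer);

public sealed class SurfaceCache
{
    private readonly ConcurrentDictionary<uint, SurfaceInfo> _cache = new();
    private readonly Func<uint, SurfaceInfo> _resolve;

    public SurfaceCache(Func<uint, SurfaceInfo> resolve) => _resolve = resolve;

    // GetOrAdd is thread-safe, but the factory may run more than once for the
    // same key under contention; only one result is stored. That is acceptable
    // as long as _resolve is deterministic and side-effect free.
    public SurfaceInfo Get(uint surfaceId) => _cache.GetOrAdd(surfaceId, _resolve);

    public int Count => _cache.Count;
}
```

If the resolver ever became expensive or non-deterministic, wrapping the value in `Lazy<SurfaceInfo>` would be one way to guarantee a single evaluation per key; with a single worker thread that is not needed.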

4.6 WbDrawDispatcher — entity bucketing tightening (Q5 Option A)

Three targeted changes inside the existing Draw flow:

Change 1: Animated-entity walk fix

Today (at lines 197-204 of WbDrawDispatcher.cs):

foreach (var entry in landblockEntries) {
    bool landblockVisible = ...;
    if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
        continue;

    foreach (var entity in entry.Entities) {
        ...
        if (!landblockVisible && !isAnimated) continue;

The if (!landblockVisible && ...) continue; only skips if there are NO animated entities. When animatedEntityIds is non-empty, the inner loop walks every entity in the invisible LB just to find the few animated ones. With ~10.7K entities at N₁=4, this is wasted iteration.

Fix: when an LB is invisible, iterate animatedEntityIds directly and look each up in a per-LB Dictionary<uint, WorldEntity> map (added to LoadedLandblock or kept in a parallel structure).

foreach (var entry in landblockEntries) {
    bool landblockVisible = ...;
    if (!landblockVisible) {
        if (animatedEntityIds is null || animatedEntityIds.Count == 0) continue;
        // Walk only animated entities in this invisible LB.
        foreach (var animatedId in animatedEntityIds) {
            if (!entry.AnimatedById.TryGetValue(animatedId, out var entity)) continue;
            // ... draw the entity
        }
        continue;
    }
    foreach (var entity in entry.Entities) { ... }
}

Change 2: Per-entity AABB cache at register time

Today: Draw recomputes aMin = position - 5, aMax = position + 5 per entity per frame. Each computation is cheap, but at ~16K entities per frame the total is measurable.

Fix: add Vector3 AabbMin, AabbMax fields to WorldEntity (or a parallel struct keyed by entity id). Populate at EntitySpawnAdapter.OnCreate (server-spawned) and LandblockLoader.BuildEntitiesFromInfo (dat-static) time. Static entities never invalidate. Dynamic entities (NPCs, players) update on position change — add WorldEntity.PositionDirty flag set by the live position update path; AABB recompute happens lazily on first read after dirty.

The AABB radius today is hard-coded PerEntityCullRadius = 5.0f — keep that as a per-mesh-bucket fallback; future improvement is to compute the real AABB from the mesh, but defer that to a later phase (it's a cross-cutting change).
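The dirty-flag pattern described above could look roughly like this. It is a sketch under stated assumptions: `CachedAabbEntity` is a hypothetical stand-in for WorldEntity, and `RecomputeCount` exists only to make the caching observable.

```csharp
using System.Numerics;

// Hypothetical sketch; the real WorldEntity carries far more state than this.
public sealed class CachedAabbEntity
{
    public const float PerEntityCullRadius = 5.0f;

    private Vector3 _position;
    private Vector3 _aabbMin, _aabbMax;
    private bool _positionDirty = true; // forces the first computation

    public int RecomputeCount { get; private set; } // illustration/tests only

    public Vector3 Position
    {
        get => _position;
        set { _position = value; _positionDirty = true; }
    }

    // Recomputed lazily on the first read after a position change.
    public (Vector3 Min, Vector3 Max) Aabb
    {
        get
        {
            if (_positionDirty)
            {
                var r = new Vector3(PerEntityCullRadius);
                _aabbMin = _position - r;
                _aabbMax = _position + r;
                _positionDirty = false;
                RecomputeCount++;
            }
            return (_aabbMin, _aabbMax);
        }
    }
}
```

Static entities set their position once, so the box is computed a single time; dynamic entities pay one recompute per position update rather than one per frame.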

Change 3: 4×4 sub-LB cell cull for partially-visible LBs

When an LB is fully visible (its AABB entirely inside the frustum), all its entities are drawn — no per-entity cull needed. Today's per-entity cull is wasted work in this case.

When an LB is partially visible, today's per-entity cull is the right work — but it walks all ~132 entities. Cheap with the AABB-cache fix (memory read), so the win here is small. Worth doing only if the cache fix alone isn't enough to hit the 2.0ms budget.

Add only if needed: bucket each LB's entities into 4×4 sub-cells (each 48 m). Compute a sub-cell AABB at register time. Per frame: for partially-visible LBs, cull at sub-cell granularity first; walk entities only inside surviving sub-cells.

Ship change #1 and #2 unconditionally; ship #3 only if the budget isn't hit by #1 + #2.
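If change #3 is needed, bucketing an entity into its sub-cell is a small index computation. The sketch below assumes positions local to the landblock (0 to 192 m on each axis) and a row-major cell layout; `SubCellGrid` is a hypothetical name.

```csharp
using System;

// Hypothetical helper for the optional 4x4 sub-LB bucketing.
public static class SubCellGrid
{
    public const float LandblockSize = 192f;                    // metres per LB side
    public const int CellsPerSide = 4;
    public const float CellSize = LandblockSize / CellsPerSide; // 48 m

    // Maps an LB-local position to a cell index in 0..15 (row-major, assumed).
    // Positions on or past the far edge are clamped into the last cell.
    public static int CellIndex(float localX, float localY)
    {
        int cx = Math.Clamp((int)(localX / CellSize), 0, CellsPerSide - 1);
        int cy = Math.Clamp((int)(localY / CellSize), 0, CellsPerSide - 1);
        return cy * CellsPerSide + cx;
    }
}
```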

4.7 TerrainModernRenderer — no structural change

The slot allocator (TerrainSlotAllocator) already grows by power-of-two doubling. At N₂=12 worst case, ~961 slots × ~15 KB per slot = ~14 MB. Allocator handles it without modification.

Per-LB frustum cull stays per-slot — at ~961 slots × ~0.3 µs/AABB-test the worst-case cull pass is ~0.3 ms. Acceptable inside the 1.0 ms terrain dispatcher budget.

The DEIC (DrawElementsIndirectCommand) array grows accordingly. The existing per-frame BufferSubData upload absorbs a 961-entry array without issue (~19 KB).

4.8 Fog tuning (SceneLightingUbo)

Existing fields (Phase G.1+):

  • FogStart — distance at which fog begins (today: somewhere outside the visible terrain range).
  • FogEnd — distance at which fog reaches full opacity.
  • FogColor — sourced from current sky state.

A.5 change: dynamically tune FogStart and FogEnd based on the current N₁/N₂:

  • FogStart = N₁ × LandblockSize × 0.7 = 4 × 192 × 0.7 ≈ 538 m.
  • FogEnd = N₂ × LandblockSize × 0.95 = 12 × 192 × 0.95 ≈ 2189 m.

The fog color matches the current sky color (already provided by SkyStateProvider) — at the far horizon, fog blends terrain into sky, hiding the N₂ edge.

The 0.7 / 0.95 multipliers are tuning knobs. Iterate during user gate. Expose as env vars during development (ACDREAM_FOG_START_MULT, ACDREAM_FOG_END_MULT) to allow fast iteration without a recompile.
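A sketch of how the two distances and their env-var multipliers could be computed. The env-var names come from this section; `FogTuning`, the parsing rules, and the rejection of non-positive multipliers are assumptions for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; not the actual SceneLightingUbo wiring.
public static class FogTuning
{
    public const float LandblockSize = 192f; // metres

    // Reads a positive float multiplier from an env var; falls back when the
    // variable is unset, unparsable, or not positive (assumed policy).
    public static float Multiplier(string envVar, float fallback)
    {
        var raw = Environment.GetEnvironmentVariable(envVar);
        return float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out float v) && v > 0f
            ? v
            : fallback;
    }

    public static (float Start, float End) Compute(int nearRadius, int farRadius)
    {
        float startMult = Multiplier("ACDREAM_FOG_START_MULT", 0.7f);
        float endMult = Multiplier("ACDREAM_FOG_END_MULT", 0.95f);
        return (nearRadius * LandblockSize * startMult, farRadius * LandblockSize * endMult);
    }
}
```

With no overrides set, `FogTuning.Compute(4, 12)` gives roughly 538 m and 2189 m, the first-cut values above.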

4.9 Visual quality wins (Q8 Option C — all three)

4.9.1 Mipmaps + 16x anisotropic on TerrainAtlas

Today: TerrainAtlas.Upload uses GL_LINEAR minification, no mipmaps.

A.5 change: after upload, call glGenerateMipmap(GL_TEXTURE_2D_ARRAY). Sampler state: GL_LINEAR_MIPMAP_LINEAR (trilinear) + GL_TEXTURE_MAX_ANISOTROPY = 16.

Affects only TerrainAtlas. Mesh atlas (entity textures) and other texture caches stay as-is.

Verification: at N₂=12, walk to a vantage point looking at terrain at range 2 km. With the fix, no shimmer. Without, "moving sparkles" visible at distance.

4.9.2 Alpha-to-coverage with MSAA on foliage

Today: mesh_modern.frag uses if (alpha < cutoff) discard; for ClipMap translucency. Produces hard, pixel-edged tree silhouettes.

A.5 change:

  • Enable MSAA 4x on the GL render target (window framebuffer).
  • In mesh_modern.frag, for ClipMap pass: write gl_SampleMask[0] based on alpha threshold instead of binary discard.

Risk: MSAA framebuffer interaction with sky / particles / UI overlay. Audit:

  • SkyRenderer — clears its own framebuffer? If so, must clear the MSAA attachment instead. Investigate.
  • ParticleRenderer — billboards already use alpha-blend; MSAA-friendly.
  • ImGui overlay — drawn after the 3D pass; must not interact with MSAA resolve.

If the audit finds blocking issues, ship 4.9.1 + 4.9.3 only and defer 4.9.2 to a later phase. Document the result either way.

4.9.3 Depth-write audit on translucent batches

Walk all translucent batch paths in WbDrawDispatcher.Draw and verify:

  • Alpha-blend (AlphaBlend, Additive, InvAlpha): glDepthMask(false).
  • Clip-map (binary alpha): glDepthMask(true) (foliage casts depth).
  • Opaque: glDepthMask(true).

Today's code at lines 401-433 sets DepthMask(true) for opaque, DepthMask(false) for transparent. Confirm ClipMap is in the opaque pass (it is, per IsOpaque returning true for ClipMap at line 738).

If audit finds nothing wrong, ship a comment + a unit test that locks in the partition. Cheap insurance against future regression.
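The partition such a unit test would lock in can be stated as a small pure function. This is a sketch only: `BlendKind` and `DepthWritePolicy` are hypothetical names, and the real batch classification in WbDrawDispatcher may be structured differently.

```csharp
// Hypothetical enum; mirrors the batch kinds named in this section.
public enum BlendKind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

public static class DepthWritePolicy
{
    // ClipMap counts as opaque: binary-alpha foliage still writes depth.
    public static bool IsOpaque(BlendKind kind)
        => kind == BlendKind.Opaque || kind == BlendKind.ClipMap;

    // The value that would be passed to glDepthMask for a batch of this kind.
    public static bool DepthMask(BlendKind kind) => IsOpaque(kind);
}
```

A regression test then only needs to assert the expected mask for every enum member, so adding a new blend kind forces an explicit decision.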

4.10 Quality Preset System (T22.5 — added mid-execution)

Background: Added between T22 (fog wiring) and T23 (DIAG budgets) at user's direction. The original spec had no preset concept; §2 was written against absolute 240 FPS on fixed N₁/N₂. T22.5 makes both radii and every quality knob user-controllable via a single enum. §2 was amended above to reflect the per-preset, refresh-rate-relative acceptance criteria.

Schema

public enum QualityPreset { Low, Medium, High, Ultra }

public readonly record struct QualitySettings(
    int NearRadius,
    int FarRadius,
    int MsaaSamples,
    int AnisotropicLevel,
    bool AlphaToCoverage,
    int MaxCompletionsPerFrame);

QualitySettings.From(preset) returns the canonical values:

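| Preset | NearRadius | FarRadius | MsaaSamples | AnisotropicLevel | AlphaToCoverage | MaxCompletionsPerFrame |
|---|---|---|---|---|---|---|
| Low | 2 | 5 | 0 | 4 | false | 2 |
| Medium | 3 | 8 | 2 | 8 | false | 3 |
| High | 4 | 12 | 4 | 16 | true | 4 |
| Ultra | 5 | 15 | 4 | 16 | true | 6 |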

QualitySettings.WithEnvOverrides(baseSettings) applies per-field env-var overrides (see §4.10.3).
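For reference, the canonical table maps onto a switch expression along these lines. The values are taken from the table above; the shape of `From` is an assumption, and the shipped implementation in QualityPreset.cs may differ.

```csharp
using System;

public enum QualityPreset { Low, Medium, High, Ultra }

public readonly record struct QualitySettings(
    int NearRadius,
    int FarRadius,
    int MsaaSamples,
    int AnisotropicLevel,
    bool AlphaToCoverage,
    int MaxCompletionsPerFrame)
{
    // Sketch of the preset lookup; one arm per row of the canonical table.
    public static QualitySettings From(QualityPreset preset) => preset switch
    {
        QualityPreset.Low    => new QualitySettings(2, 5, 0, 4, false, 2),
        QualityPreset.Medium => new QualitySettings(3, 8, 2, 8, false, 3),
        QualityPreset.High   => new QualitySettings(4, 12, 4, 16, true, 4),
        QualityPreset.Ultra  => new QualitySettings(5, 15, 4, 16, true, 6),
        _ => throw new ArgumentOutOfRangeException(nameof(preset)),
    };
}
```

`QualitySettings.From(QualityPreset.High)` yields NearRadius 4 and FarRadius 12, the N₁/N₂ values used throughout the rest of this spec.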

Persistence and UI

DisplaySettings.Quality (type QualityPreset) persists via the existing settings.json infrastructure (Phase L.0). The Settings panel (F11) exposes a Quality dropdown in its Display tab (SettingsPanel.RenderDisplayTab).

Wiring (GameWindow.OnLoad + ReapplyQualityPreset)

  1. GameWindow.OnLoad resolves the active QualitySettings: QualitySettings.From(displaySettings.Quality).WithEnvOverrides(...).
  2. StreamingController and LandblockStreamer are built with the preset's NearRadius / FarRadius.
  3. TerrainAtlas.SetAnisotropic(settings.AnisotropicLevel) called once at load and again on reapply.
  4. WindowOptions.Samples = settings.MsaaSamples applied at window creation time only (the default framebuffer's sample count is fixed when the window and GL context are created, so a mid-session MSAA change is not supported).
  5. WbDrawDispatcher.AlphaToCoverage = settings.AlphaToCoverage.
  6. StreamingController.MaxCompletionsPerFrame = settings.MaxCompletionsPerFrame.

Mid-session quality change (F11 dropdown change → Save):

  • GameWindow.ReapplyQualityPreset rebuilds StreamingController + LandblockStreamer with the new radii, re-applies anisotropic and AlphaToCoverage.
  • If MsaaSamples changed, logs a warning that MSAA sample count cannot be changed mid-session; requires restart.

Env-var overrides (§4.10.3)

Applied by QualitySettings.WithEnvOverrides after the base preset is resolved. Each field has one env var; all are optional. Logged at startup.

| Env var | Field overridden |
|---|---|
| ACDREAM_NEAR_RADIUS | NearRadius |
| ACDREAM_FAR_RADIUS | FarRadius |
| ACDREAM_MSAA_SAMPLES | MsaaSamples |
| ACDREAM_ANISOTROPIC | AnisotropicLevel |
| ACDREAM_A2C | AlphaToCoverage (1/0/true/false) |
| ACDREAM_MAX_COMPLETIONS_PER_FRAME | MaxCompletionsPerFrame |
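The per-field override with fallback can be sketched with two small parsing helpers. `EnvOverride` is a hypothetical name, and treating negative integers as invalid is an assumption; the accepted boolean spellings (1/0/true/false) come from the table above.

```csharp
using System;
using System.Globalization;

// Hypothetical parsing helpers; not the actual WithEnvOverrides implementation.
public static class EnvOverride
{
    // Returns the parsed value, or the base setting when the variable is
    // unset, unparsable, or negative (negative rejection is an assumption).
    public static int Int(string name, int fallback)
    {
        var raw = Environment.GetEnvironmentVariable(name);
        return int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out int v) && v >= 0
            ? v
            : fallback;
    }

    // Accepts 1/0 and true/false (case-insensitive); anything else falls back.
    public static bool Bool(string name, bool fallback)
    {
        var raw = Environment.GetEnvironmentVariable(name)?.Trim();
        return raw switch
        {
            "1" => true,
            "0" => false,
            _ => bool.TryParse(raw, out bool v) ? v : fallback,
        };
    }
}
```

An override pass would then rebuild the settings record field by field, for example `NearRadius = EnvOverride.Int("ACDREAM_NEAR_RADIUS", baseSettings.NearRadius)`.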

Tests

12 tests in tests/AcDream.UI.Abstractions.Tests/Settings/QualityPresetTests.cs cover: canonical preset values per enum member; WithEnvOverrides no-op when no env vars set; WithEnvOverrides each override individually; invalid env-var value falls back to base setting.

Files

  • src/AcDream.UI.Abstractions/Settings/QualityPreset.cs — new
  • src/AcDream.UI.Abstractions/Settings/DisplaySettings.cs — Quality field added
  • src/AcDream.UI.Abstractions/Panels/Settings/SettingsPanel.cs — Display tab Quality dropdown (RenderDisplayTab method)
  • src/AcDream.App/Rendering/GameWindow.cs — ReapplyQualityPreset, OnLoad preset wiring
  • tests/AcDream.UI.Abstractions.Tests/Settings/QualityPresetTests.cs — new (12 tests)

Out of scope (deferred)

  • Auto-detect preset on first launch (Phase A.6 / N.6.5).
  • Adaptive runtime preset drop on budget miss.
  • Per-feature toggles below preset level.

Commits: afa4200 (schema + tests), 28d2c60 (wiring).


5. Data flow

Per-frame (steady state)

GameWindow.OnUpdate(dt)
  └─ StreamingController.Tick(playerCx, playerCy)
      ├─ region.RecenterTo(...)  // produces TwoTierDiff if center changed
      ├─ for each ToLoadFar:   _enqueueLoad(id, LoadFar)
      ├─ for each ToLoadNear:  _enqueueLoad(id, LoadNear)
      ├─ for each ToPromote:   _enqueueLoad(id, PromoteToNear)
      ├─ for each ToDemote:    _state.RemoveEntities(id)  // on render thread
      ├─ for each ToUnload:    _enqueueUnload(id)
      └─ drainCompletions(MaxCompletionsPerFrame=4)
          ├─ Loaded.Far:    _terrain.AddLandblock(meshData);  _state.AddLandblock(...)
          ├─ Loaded.Near:   _terrain.AddLandblock(meshData);  _state.AddLandblock(... entities)
          ├─ Promoted:      _state.AddEntitiesToExisting(id, entities)
          ├─ Unloaded:      _terrain.RemoveLandblock(id);  _state.RemoveLandblock(id)
          └─ Failed/Crash:  log

GameWindow.OnRender
  ├─ TerrainModernRenderer.Draw(camera, frustum)
  │   └─ glMultiDrawElementsIndirect across all near + far slots that pass cull
  └─ WbDrawDispatcher.Draw(camera, gpuWorldState.LandblockEntries, frustum, visibleCellIds, animatedEntityIds)
      ├─ for each LB entry:
      │   ├─ if invisible: walk only animatedEntityIds (Change #1)
      │   └─ if visible: walk entities, AABB cache lookup (Change #2)
      ├─ classify into groups, build SSBO, multi-draw indirect
      └─ flush DIAG every ~5 s

Worker thread

LandblockStreamer.WorkerLoop
  while running:
    job = jobQueue.dequeue()
    switch job.Kind:
      LoadFar:
        block = dats.Get<LandBlock>(id)
        meshData = LandblockMesh.Build(block, ..., _surfaceCache)
        completionQueue.enqueue(Loaded(id, Far, block, [], meshData))
      LoadNear:
        block = dats.Get<LandBlock>(id)
        info  = dats.Get<LandBlockInfo>(...)
        entities = LandblockLoader.BuildEntitiesFromInfo(info)
        scenery  = WbSceneryAdapter.GenerateScenery(block, ...)
        meshData = LandblockMesh.Build(block, ..., _surfaceCache)
        completionQueue.enqueue(Loaded(id, Near, block, entities + scenery, meshData))
      PromoteToNear:
        info  = dats.Get<LandBlockInfo>(...)
        // Heightmap not re-read; scenery generation needs LandBlock for height
        // sampling — read it again from disk cache (DatCollection caches the
        // last-read block; cheap second access) OR pass through from render
        // thread's terrain-slot snapshot (deferred plan-level decision).
        block = dats.Get<LandBlock>(id)
        entities = LandblockLoader.BuildEntitiesFromInfo(info)
        scenery  = WbSceneryAdapter.GenerateScenery(block, ...)
        completionQueue.enqueue(Promoted(id, entities + scenery))

6. Threading model

  • Render thread: drives StreamingController.Tick, drains the completion queue, calls TerrainModernRenderer.AddLandblock / RemoveLandblock, mutates GpuWorldState. All GL calls on this thread.
  • One streaming worker thread: dat reads, mesh build, scenery generation. Owns _surfaceCache (now ConcurrentDictionary) — render thread does not access it directly.
  • Network thread: unchanged from Phase A.3 — drains UDP into the channel; render thread decodes.

Synchronization:

  • Job queue: Channel<LbStreamJob> (writer = render thread via _enqueueLoad; reader = worker).
  • Completion queue: ConcurrentQueue<LandblockStreamResult> (writer = worker; reader = render thread).
  • _surfaceCache: ConcurrentDictionary<uint, SurfaceInfo> populated by LandblockMesh.Build on the worker; read by future paths if any (none today).
  • TerrainBlendingContext: read-only post-init. No lock.
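The two queues can be sketched end to end as follows. This is an illustration under stated assumptions: `Job`, `Result`, and `StreamPipeline` are hypothetical stand-ins for LbStreamJob, LandblockStreamResult, and the streamer/controller pair, and the worker body is a placeholder for the dat read and mesh build.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical job/result shapes standing in for the real types.
public readonly record struct Job(uint LandblockId);
public readonly record struct Result(uint LandblockId);

public sealed class StreamPipeline
{
    private readonly Channel<Job> _jobs = Channel.CreateUnbounded<Job>(
        new UnboundedChannelOptions { SingleReader = true });
    private readonly ConcurrentQueue<Result> _completions = new();

    // Render thread: enqueue work for the worker.
    public bool Enqueue(uint landblockId) => _jobs.Writer.TryWrite(new Job(landblockId));

    // Signals that no more jobs will arrive, letting the worker loop exit.
    public void Shutdown() => _jobs.Writer.Complete();

    // Worker: process jobs in order until the channel completes.
    public Task RunWorkerAsync() => Task.Run(async () =>
    {
        await foreach (var job in _jobs.Reader.ReadAllAsync())
            _completions.Enqueue(new Result(job.LandblockId)); // placeholder for dat read + mesh build
    });

    // Render thread: apply at most maxPerFrame completions this frame.
    public int Drain(int maxPerFrame, Action<Result> apply)
    {
        int n = 0;
        while (n < maxPerFrame && _completions.TryDequeue(out var r)) { apply(r); n++; }
        return n;
    }
}
```

Capping the drain at MaxCompletionsPerFrame bounds the render-thread cost of applying results per frame; any remaining completions simply wait for the next Tick.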

7. Error handling

  • Worker crash: caught in worker loop, posts LandblockStreamResult.WorkerCrashed. Render thread logs to console. (Existing pattern.)
  • Dat read failure: posts LandblockStreamResult.Failed. Render thread logs. Streaming continues with the LB skipped — region still tracks it as resident so we don't retry forever, but the slot stays empty.
  • AABB cache invalidation race: dynamic entity moves while the dispatcher is walking. Acceptable — at worst, the entity culls or draws based on the previous frame's position. Position is updated in the network handler (also render-thread today) so no actual race.
  • Promotion timing: if the player crosses N₁ inward, we enqueue a PromoteToNear job on the worker. Until it completes, the LB has terrain but no scenery / entities. Frame budget is unaffected (only LoadedLandblock changes, and the dispatcher already handles missing entities by walking zero-length lists).
  • Unload during in-flight load: enqueue an unload while a load is in flight. When the load completes, render thread sees the LB is no longer resident — drop the result silently. Same pattern as today.

8. Testing strategy

Unit tests (offline, no GL)

Add to tests/AcDream.Core.Tests/Streaming/:

  • StreamingRegion_TwoTier_FirstTick_LoadsNearAndFarSeparately — first call produces ToLoadNear populated for inner ring, ToLoadFar populated for outer ring, ToPromote empty (nothing was previously resident).
  • StreamingRegion_TwoTier_NullToFar_OnFarRingEntry — LB rolls into far window from null. Asserts entry in ToLoadFar, not ToLoadNear.
  • StreamingRegion_TwoTier_FarToNear_OnNearRingEntry — LB was far-resident, player walks toward it, LB enters near window. Asserts entry in ToPromote, not ToLoadNear.
  • StreamingRegion_TwoTier_NullToNear_OnTeleport — observer center jumps far enough that an LB goes from null → Near in one frame (e.g., teleport). Asserts entry in ToLoadNear, not ToPromote.
  • StreamingRegion_TwoTier_NearToFar_OnNearBoundaryExitPlusHysteresis — asserts entry in ToDemote only after distance exceeds NearRadius + 2.
  • StreamingRegion_TwoTier_FarToNull_OnFarBoundaryExitPlusHysteresis — asserts entry in ToUnload only after distance exceeds FarRadius + 2.
  • StreamingRegion_TwoTier_HysteresisHoldsAcrossOscillation — walk back-and-forth across N₁ five times within the hysteresis radius; assert no demote events fire.
  • StreamingController_TwoTier_DrainsRoutedByVariant — Loaded.Far, Loaded.Near, and Promoted each route to the right state mutation on the render thread.

Add to tests/AcDream.Core.Tests/Rendering/Wb/:

  • WbDrawDispatcher_AnimatedEntities_InInvisibleLb_NoFullEntityWalk — verify Change #1 (only iterates animatedEntityIds, not Entities).
  • WbDrawDispatcher_PerEntityAabbCached_NotRecomputed — assert AABB fields are read, not recomputed, for static entities.

Conformance tests

  • TerrainModernConformanceTests (existing) — must still pass. The visual mesh Z must agree with TerrainSurface.SampleZFromHeightmap to within 1 mm across both tiers.
  • LandblockMeshTests (existing) — must still pass. Worker-thread mesh build produces byte-identical results to render-thread build for the same inputs.

Perf gate (manual, with [WB-DIAG] + [TERRAIN-DIAG])

  • Standstill bench: launch with ACDREAM_WB_DIAG=1, stand at Holtburg dueling field for 60 s. Read median + p95 + p99 from log.
  • Walking bench: launch with diag, run from Holtburg to North Yanshi, ~60 s. Same metrics.
  • First traversal bench: clear OS file cache (or reboot), launch with diag, walk into a region not previously visited, capture the worker-thread fill duration + render-thread frame time during fill.

Visual gate (manual, user-driven)

User launches the client, walks the standard route, confirms:

  1. Horizon visible at 2.3 km.
  2. Fog blend is smooth (no scenery cliff at N₁).
  3. No shimmer on distant terrain.
  4. Smooth tree edges (foliage A2C).
  5. No new z-fighting / depth artifacts.

9. Out of scope (explicitly deferred)

Per the brainstorm Q10 confirmation:

  • GPU-side culling (compute pre-pass) — N.6.
  • Persistent-mapped indirect buffer — N.6.
  • Multi-thread mesh-build worker pool — N.6 if first-traversal fill feels too slow at gate.
  • Static/dynamic persistent groups (Q5 Option B — the "compute the group key once at spawn" architecture change) — separate later phase (likely A.6 or N.6.5).
  • Billboard / impostor scenery at far tier — escalation only if the fog'd terrain horizon looks too bare at gate.
  • Wider N₁ hysteresis (Option C, radius+3) — single-line tweak only if gate finds entity pop-in along the boundary.
  • Far-tier terrain mesh LOD (decimating 2×2 LBs) — not needed at N₂=12; revisit only if N₂ grows beyond 15.
  • Sky / particles modern path migration — N.7+ phases.
  • EnvCell modern path migration — separate phase.
  • Shadow mapping — separate visual phase, later.
  • Strict 240 Hz during walking (Q9 Option A) — graduate to in a perf-polish phase if we want to commit to it.

10. Risks

  1. Fog tuning visual gate (highest risk). Hardest non-engineering risk. The 0.7 / 0.95 multipliers in §4.8 are first-cut numbers. If the fog band is too thin (visible scenery cliff at N₁) or too thick (terrain looks washed out), iterate on the multipliers. Mitigation: expose FogStart / FogEnd as tunable env vars during A.5 development for fast iteration.
  2. A2C / MSAA framebuffer interaction (moderate risk). MSAA on the GL render target may break sky / particles / UI rendering. Audit during implementation. Fallback: ship Q8 Option B (mipmaps + depth-audit only) if A2C goes sideways. Document the result.
  3. Worker starvation on first-traversal (low-moderate risk). ~2.7 s of sequential mesh build on first walk into virgin region. Render thread frame time stays in budget; the visible effect is the horizon visibly filling. Acceptable per Q9 Option B; graduate to multi-worker pool in N.6 if user complains.
  4. Tier-boundary churn (low risk). When the player crosses N₁ back and forth, demote→promote→demote fires. Hysteresis (radius+2) is the buffer. If thrash is visible, widen to radius+3.
  5. Entity AABB cache invalidation (low risk). Dynamic entities must recompute AABB on position change. Single-threaded render thread means no concurrent mutation; the dirty-flag pattern is straightforward.
  6. Server broadcast radius mismatch (low risk). If ACE's broadcast radius is < N₁=4, NPCs in outer near-tier LBs won't be server-broadcast (they don't exist in our state). Mitigation: N₁=4 is conservative — typical ACE configs broadcast at 5-7 LBs. If observed, drop N₁ to 3.

11. What was deferred (post-A.5)

The following items were identified during A.5 development but deferred to post-A.5 phases. They are tracked as OPEN issues in docs/ISSUES.md.

  1. Tier 1 entity-classification cache (commit 3639a6f reverted at 9b49009): First attempt cached meshRef.PartTransform which is mutated per frame for animated entities (skeletal pose). Next attempt needs: (a) audit AnimationSequencer + AnimationHookRouter to identify ALL per-frame mutations of MeshRef state; (b) redesign cache to bypass animated entities OR cache only the animation-invariant subset; (c) test specifically with a moving animated NPC on screen. (docs/ISSUES.md #53)

  2. Lifestone missing visual: The Holtburg lifestone has not rendered since earlier in A.5 development. Possibly Bug A's far-tier strip incorrectly catching a near-tier entity, or a separate earlier regression. (docs/ISSUES.md #52)

  3. Plumb JobKind through BuildLandblockForStreaming: Bug A's fix (commit 9217fd9) strips entities post-load in the worker. Proper fix: skip the LandBlockInfo + scenery load entirely for far-tier jobs. ~30 min. (docs/ISSUES.md #54)

  4. Tier 2 — Static/dynamic split with persistent groups: ~2-week phase. Avoids per-frame entity re-classification by maintaining stable groups keyed at spawn time. Roadmap doc at docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.

  5. Tier 3 — GPU-side culling via compute pre-pass: ~1-month phase. Same roadmap doc.

  6. Eliminate ToEntries adapter allocation: tiny win (~25 KB/frame).

  7. InvalidateEntity wiring on palette/ObjDesc events: needed by the Tier 1 retry.

  8. Visual gate at full High preset: never validated due to the GPU+CPU stack-up OS crash earlier in A.5. With Bug A fixed the crash likely won't recur; defer retest to post-A.5 perf polish.


12. References (formerly §11)