diff --git a/docs/research/2026-05-09-phase-n5b-handoff.md b/docs/research/2026-05-09-phase-n5b-handoff.md new file mode 100644 index 0000000..05d7558 --- /dev/null +++ b/docs/research/2026-05-09-phase-n5b-handoff.md @@ -0,0 +1,445 @@ +# Phase N.5b — Terrain on the Modern Rendering Path — Cold-Start Handoff + +**Created:** 2026-05-09, immediately after N.5 ship + roadmap A.5 addition. +**Audience:** the next agent picking up terrain rendering work. +**Purpose:** give you everything you need to start N.5b cold, without +spelunking through the N.5 session's history. + +--- + +## TL;DR + +N.5 just shipped: `WbDrawDispatcher` lifts entity rendering onto bindless +textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23 ms / frame +median at Holtburg courtyard, ~810 fps sustained. **Entities only — +terrain is still on a separate legacy renderer.** + +**N.5b's job: port terrain rendering onto the same modern primitives that +N.5 just delivered.** Concretely: + +1. Replace `TerrainRenderer` + `TerrainChunkRenderer` (per-landblock VAO, + `glDrawElements`, `sampler2D` atlases) with a multi-draw-indirect + dispatcher analogous to `WbDrawDispatcher`, sharing the modern path's + bindless texture infrastructure where it makes sense. +2. Keep terrain visually identical to today. The legacy `TerrainAtlas` + + `terrain.vert/.frag` already render correctly; don't introduce visual + regressions. +3. Resolve issue #51 (WB's terrain split formula diverges from retail's + `FSplitNESW`) — see "Load-bearing constraint" below. + +The roadmap estimate is **~1 week** because the modern-path primitives +are already built. The actual work is porting + bridging + a real +correctness decision on the split formula. + +--- + +## Load-bearing constraint: Issue #51 (terrain split formula) + +This is the design decision that will dominate the brainstorm. 
**Read +`docs/ISSUES.md` issue #51 in full before brainstorming.** + +The short version: + +- **acdream's terrain split formula** is the retail-decomp `FSplitNESW` + (constants `0x0CCAC033` / `0x421BE3BD` / `0x6C1AC587` / `0x519B8F25`). + Documented in `CLAUDE.md` as **the** real AC formula. Ours is degree-2 + polynomial in (x,y). Used by: + - `src/AcDream.Core/Physics/TerrainSurface.cs:113-120` (physics — + `IsSplitSWtoNE`) + - `src/AcDream.Core/World/TerrainBlending.cs` (visual mesh) +- **WB's terrain split formula** in `references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44` + is LINEAR in (x,y). Different math; they cannot be algebraically + equivalent. They disagree on a meaningful fraction of cells — up to + ~2m height delta on sloped cells. +- **WB's `TerrainGeometryGenerator`** (the obvious adoption target for + N.5b's mesh path) uses WB's formula. If we adopt it wholesale, our + visual terrain disagrees with our physics (which uses retail's + formula). Player floats / sinks. Already-fixed bug class returns. + +**Three viable design paths** (the brainstorm has to pick one): + +- **Path A — Adopt WB's formula everywhere.** Switch both physics AND + visual mesh to WB's `CalculateSplitDirection`. Use WB's + `TerrainGeometryGenerator` directly. Visual + physics stay synced. + Risk: physics now disagrees with retail server-authoritative Z by up + to ~2m on sloped cells. Server-side validation (if any) might reject + movements; the player might "snap" to server's Z when packets land. + Need to confirm whether ACE actually validates Z or trusts the + client. Lowest implementation effort. + +- **Path B — Keep retail's formula; fork-patch WB.** Patch + `references/WorldBuilder/.../TerrainUtils.cs` to use retail's formula + in our fork. Push the patch to the `acdream` branch of the fork (per + the WB submodule plumbing fixed in the previous session). Submit + upstream PR if Chorizite wants it. Most retail-faithful. 
Implementation + effort: medium. Coordination overhead with upstream. + +- **Path C — Use WB's mesh layout but our formula.** Don't use WB's + `TerrainGeometryGenerator` directly. Instead port WB's *mesh layout* + (vertex buffer shape, index buffer per landblock, atlas integration) + into a new acdream-side `TerrainGeometryGenerator` that uses retail's + formula. Highest effort but cleanest separation — no fork patches. + +Recommendation in the brainstorm: probably **Path A** if quantification +shows ACE doesn't validate Z aggressively (retail's network protocol +is "client tells server position; server trusts within sanity bounds"), +otherwise **Path B**. Path C is overengineered for the level of +divergence. + +**Step 1 of the brainstorm:** quantify the divergence. Run WB's formula ++ retail's formula across all (lbX, lbY, cellX, cellY) tuples for +several representative landblocks (Holtburg, Foundry, open landscape, +some sloped terrain like Direlands). Record disagreement rate. If <5% +of cells disagree, Path A's risk is bounded; if >20%, Path B becomes +more attractive. + +--- + +## Where N.5 left things + +### Branch state + +After last session: +- `main` is at `a64cd11` ("docs(roadmap): add A.5 — two-tier streaming") +- N.5 SHIP at `27eaf4e` (merge commit) +- N.5 ship-amendment at `e0dbc9c` (legacy renderers retired) +- Legacy `InstancedMeshRenderer` + `StaticMeshRenderer` + `WbFoundationFlag` + ARE GONE. Bindless is mandatory; a missing extension throws + `NotSupportedException` at startup. + +### What works in N.5 + +- **Entity rendering:** `WbDrawDispatcher` does ~12-15 GL calls per frame + for all visible entities regardless of scene complexity. Three SSBO + uploads (instance matrices @ binding=0, batch data @ binding=1, + indirect commands) + 2 `glMultiDrawElementsIndirect` calls (opaque + + transparent passes). 
+- **Bindless texture infrastructure:** `BindlessSupport` wrapper + + `TextureCache` parallel `UploadRgba8AsLayer1Array` path + three + `Bindless*` `GetOrUpload` methods + two-phase `Dispose`. All textures + on the WB modern path are 1-layer `Texture2DArray` + `sampler2DArray`. +- **mesh_modern.vert/.frag** preserves the full `SceneLighting` UBO + (8 lights + fog + lightning flash + per-channel clamp) — visual + identity to N.4 confirmed at user gates. +- **Diagnostic:** CPU stopwatch + GL_TIME_ELAPSED queries logged via + `[WB-DIAG]` (GPU timing currently shows 0/0 — query polling needs + double-buffering, deferred to N.6). + +### What N.5b inherits + +These are levers N.5b will pull on: + +- **`BindlessSupport`** at `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` + — already wraps `ArbBindlessTexture`. Reusable for terrain textures. +- **`DrawElementsIndirectCommand` struct** at `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` + — 20-byte layout, ready to populate per-landblock terrain commands. +- **`BuildIndirectArrays` helper** at `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` + — pure CPU layout helper, currently scoped to entities; could + generalize for terrain. +- **`TextureCache`** with parallel Texture2DArray bindless cache — + but terrain has its own `TerrainAtlas` (multi-layer texture array + for splat blending). N.5b decides whether to integrate or keep + separate. +- **`SceneLightingUbo`** at binding=1 — terrain.frag already consumes + it; the new modern terrain shader continues that. +- **Retail's `FSplitNESW`** in `src/AcDream.Core/World/TerrainBlending.cs` + — the formula to preserve (or replace, per Path A/B/C decision). + +### What still uses the legacy path (NOT N.5b's job) + +- **Sky rendering** (`SkyRenderer.cs`) — N.8 territory. +- **Particles** (`ParticleRenderer.cs`) — N.8 territory. +- **Debug lines** (`DebugLineRenderer.cs`) — fine as-is. 
+- **UI / text** (`TextRenderer.cs` + ImGui) — fine as-is; ImGui has its + own backend. + +--- + +## What N.5b is — technical detail + +### Today's terrain stack (1383 lines total: 1087 C# + ~300 shaders) + +| File | Lines | Role | +|---|---|---| +| `src/AcDream.App/Rendering/TerrainRenderer.cs` | 247 | Top-level orchestration; per-landblock cull + draw | +| `src/AcDream.App/Rendering/TerrainChunkRenderer.cs` | 454 | Per-landblock VAO + IBO management; `glDrawElements` per visible chunk | +| `src/AcDream.App/Rendering/TerrainAtlas.cs` | 386 | Multi-layer `Texture2DArray` atlas for terrain splat textures | +| `src/AcDream.App/Rendering/Shaders/terrain.vert` | 147 | Per-vertex world position, normal, UV, palCode | +| `src/AcDream.App/Rendering/Shaders/terrain.frag` | 149 | Splat blending across 4 corner textures | + +**Per-frame today:** for each visible landblock, bind its VAO + IBO, +bind the terrain texture atlas, set per-landblock uniforms, issue +`glDrawElements`. With 25 landblocks at default radius=2, that's ~25 +draw calls per frame for terrain (cheap, but doesn't scale). + +### WB's terrain stack (1937 lines + ~200 lines shaders) + +| File | Lines | Role | +|---|---|---| +| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainRenderManager.cs` | 1023 | Top-level coordinator; uses multi-draw-indirect already | +| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs` | 326 | Mesh generation per landblock (uses WB's split formula — see #51) | +| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/LandSurfaceManager.cs` | 588 | Texture atlas management + alpha mask generation for splat blending | +| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Shaders/Landscape.vert/.frag` | ~200 | Modern shader; consumes SSBO instance data + bindless atlas handle | + +WB's renderer is structurally close to what N.5b targets. 
Key differences +from acdream: + +- WB uses **uint32 indices** (`DrawElementsType.UnsignedInt`) for + terrain — landblocks have more vertices than fit in ushort range. + N.5's `WbDrawDispatcher` uses `UnsignedShort` for entities. +- WB packs all visible terrain into shared mesh buffers + dispatches + via `glMultiDrawElementsIndirect`. We can mirror that pattern. +- WB's `LandSurfaceManager` builds per-landblock alpha masks for splat + blending; this is the bulk of its 588 lines. Different model from + our `TerrainAtlas` which uses palCode-based blending in the fragment + shader. + +### What N.5b actually does + +Roughly four sub-pieces: + +1. **Terrain mesh on global VBO/IBO.** Following N.5's pattern, all + visible terrain landblocks pack into a single global vertex buffer + + index buffer. Per-landblock entries become `DrawElementsIndirectCommand` + records with `firstIndex` + `baseVertex` offsets. One + `glMultiDrawElementsIndirect` call per pass. +2. **Bindless terrain atlas.** Either (a) port `TerrainAtlas` to use + bindless handles + sampler2DArray (small change, keeps current + blending math), or (b) adopt WB's `LandSurfaceManager` (bigger + change, switches to alpha-mask blending). Brainstorm decides. +3. **New shader `terrain_modern.vert/.frag`** that: + - Reads per-landblock data from an SSBO (analogous to + mesh_modern's `Batches[]`) + - Samples the terrain atlas via bindless `sampler2DArray` handle + - Continues to consume `SceneLighting` UBO @ binding=1 (no + visual identity regression vs N.4 — same lighting math) +4. **Resolve issue #51** per Path A/B/C decision in the brainstorm. + +### Per-frame target shape + +``` +// Once at init: +Build global terrain VAO + VBO + IBO (resizable; grows as landblocks stream in) +Generate bindless handles for terrain atlas + +// Per frame: +1. Frustum cull landblocks (existing per-landblock AABB test) +2. Build per-visible-landblock IndirectGroupInput list +3. Upload _terrainBatchSsbo + _terrainIndirectBuffer +4. 
glBindVertexArray(globalTerrainVao) +5. glBindBufferBase(SHADER_STORAGE_BUFFER, 1, _terrainBatchSsbo) +6. glBindBuffer(DRAW_INDIRECT_BUFFER, _terrainIndirectBuffer) +7. glMultiDrawElementsIndirect(...) // ONCE per pass — opaque pass + (terrain has no transparent; one indirect call total) +``` + +Total ~6-8 GL calls per frame for terrain regardless of scene size. +At radius=5 (121 landblocks) this is the same number of GL calls as +at radius=2 (25 landblocks). + +--- + +## Files to read before brainstorming + +In rough order: + +1. **`docs/ISSUES.md` issue #51** (49-103). Load-bearing constraint. +2. **`CLAUDE.md`** the "Reference hierarchy by domain" terrain row + + "Reference repos: check ALL FOUR" — terrain math is one of the + places where checking multiple references matters most. +3. **acdream terrain stack:** + - `src/AcDream.App/Rendering/TerrainRenderer.cs` (247 lines, easy + read) + - `src/AcDream.App/Rendering/TerrainChunkRenderer.cs` (454 lines — + this is the per-landblock GL plumbing that goes away in N.5b) + - `src/AcDream.App/Rendering/TerrainAtlas.cs` (386 lines — + multi-layer atlas) + - `src/AcDream.App/Rendering/Shaders/terrain.vert/.frag` (~300 + lines combined) + - `src/AcDream.Core/World/TerrainBlending.cs` (the FSplitNESW + side; preserve or replace) +4. **WB terrain stack:** + - `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainRenderManager.cs` + (1023 lines — the model to mirror; multi-draw indirect already + in place) + - `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs` + (326 lines — uses WB's split formula; per #51) + - `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/LandSurfaceManager.cs` + (588 lines — alpha-mask atlas; alternative to our `TerrainAtlas`) + - `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Shaders/Landscape.vert/.frag` + - `references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44` + (CalculateSplitDirection — WB's formula) +5. 
**N.5 plan + spec** (cribs for the modern-path pattern): + - `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md` + (what we did, including amendments) + - `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md` + (decisions log) +6. **Memory: `project_phase_n5_state.md`** — three high-value gotchas + from N.5 (texture target lock-in, bindless Dispose order, + GL_TIME_ELAPSED double-buffering). All apply to N.5b. + +--- + +## Brainstorm questions + +These are the questions to resolve in the brainstorm step. Don't +prejudge — bring them to the user with options + recommendation: + +1. **Path A vs B vs C** for issue #51 (the terrain split formula). The + biggest decision; everything else flows from it. Should be + informed by quantifying the divergence rate first (run both + formulas across representative landblocks). + +2. **Atlas model.** Keep `TerrainAtlas` (palCode-based fragment shader + blending) and just bindless-ify it, or adopt WB's `LandSurfaceManager` + (alpha-mask blending)? Tradeoff: minimal change vs alignment with + WB. Visual outcome should be identical either way. + +3. **Mesh ownership.** Use a single global VBO/IBO for all terrain + (mirror N.5's pattern), or per-landblock VBO/IBO with multi-draw + indirect over them? Single global is more cache-friendly + more + like N.5, but requires resizable buffer management. Per-landblock + is simpler but doesn't share the IBO across draws. + +4. **Index format.** N.5 uses `UnsignedShort` (max 64K verts per + draw). Terrain landblocks have many more verts than that. WB uses + `UnsignedInt`. Just commit to `UnsignedInt` for terrain? + +5. **Shader unification.** Separate `terrain_modern.vert/.frag` or + merge with `mesh_modern.vert/.frag` via uniforms? Probably separate + since the vertex layouts differ (terrain has palCode; entities + have UV). + +6. **Streaming integration.** Today's `TerrainChunkRenderer` integrates + with the streaming loader (landblocks come and go). 
N.5b's global + buffer model needs a strategy for adding/removing landblocks from + the global VBO/IBO without per-add reallocation. Free-list / + compaction / fixed-slot allocator? + +7. **Conformance test.** Per the lessons from N.2, "WB's terrain + formula differs from retail" — we need a test that proves our + visual terrain matches our physics terrain (i.e., visual mesh Z + at any (X,Y) equals `TerrainSurface.GetHeight(X,Y)`). Run a sweep + across ~1M (X,Y) points; assert |delta| < epsilon. + +8. **Visual verification gate.** Holtburg + Foundry + sloped terrain + (Direlands?) + cell transitions. The split-formula-disagreement + bug class shows up as terrain "wobble" at cell boundaries — that's + the specific thing to look for. + +--- + +## Acceptance criteria for the whole phase + +- Visual terrain identical to current legacy path (no missing chunks, + no z-fighting at cell boundaries, no texture seams) +- `[WB-DIAG]` shows terrain accounting for ~6-8 GL calls per frame + regardless of scene size (currently scales with visible landblock + count, ~25-121 calls) +- Frame time measurably lower in dense-terrain scenes (specify scenes + in the spec — probably radius=5 outdoor roaming) +- Conformance test: visual mesh Z agrees with `TerrainSurface.GetHeight` + within epsilon across a 1M-point sweep +- All existing tests still green +- The split-formula decision (#51) is resolved with a clear writeup + in the spec + +--- + +## What you'll be doing in the first 30 minutes + +1. Read this handoff in full. +2. Read `docs/ISSUES.md` issue #51 in full. +3. Read CLAUDE.md "Reference hierarchy by domain" terrain row. +4. Read `TerrainRenderer.cs` + `TerrainChunkRenderer.cs` end-to-end. +5. Skim `TerrainRenderManager.cs` (WB's) — at least the multi-draw + indirect dispatch section. +6. Verify build is green: `dotnet build`. +7. 
Verify N.5 ship is intact: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"` should produce 71 passing tests, 0 failures. +8. Quantify the formula divergence (Path A/B/C decision input): + write a one-shot test that runs both formulas across all + (lbX, lbY, cellX, cellY) tuples for ~10 representative landblocks + and reports disagreement rate. +9. Invoke the `superpowers:brainstorming` skill with the user. Walk + through the 8 brainstorm questions above. Bring the formula + divergence number to inform the Path A/B/C decision. +10. Write the spec. +11. Write the plan. +12. Begin Week 1 implementation per the plan. + +Don't skip the brainstorm. The terrain split formula decision (Path +A/B/C) has real downstream consequences — physics, server-Z agreement, +fork-patching of WB. Needs explicit user input, not "the agent makes +a call and goes." This phase is structurally the same shape as N.5 — +brainstorm → spec → plan → tasks-with-checkboxes → commits-update-checkboxes +→ final SHIP commit. + +--- + +## Things to NOT do + +- **Don't adopt WB's terrain code wholesale without resolving #51 + first.** The split formula decision affects the entire pipeline; + patching it after-the-fact requires re-doing visual + physics + the + TerrainGeometryGenerator port. +- **Don't introduce a per-cell wobble at landblock boundaries.** That's + the visible signature of the formula disagreement. If you see it + during visual verification, the formula isn't aligned between your + physics and visual paths. +- **Don't break the existing `[WB-DIAG]` instrumentation.** Add a + separate counter for terrain (`terrainDrawsIssued`) so the entity + + terrain perf can be observed independently. +- **Don't bundle A.5 (two-tier streaming + horizon LOD) into this + phase.** N.5b is "terrain on modern path"; A.5 is "split the radius + + LOD." Different scopes, different brainstorms. 
A.5 might become + natural to pick up next once N.5b lands. +- **Don't try to re-port `FSplitNESW` if you're going Path A.** The + whole point of Path A is to commit to WB's formula. If you keep + retail's formula via Path B/C, do it once, definitively. +- **Don't skip the formula-divergence quantification.** Step 8 of + the first 30 minutes. The Path decision should be data-informed, + not gut-feel. <5% divergence makes Path A bounded-risk; >20% makes + Path B/C more attractive. +- **Don't skip visual verification.** The split-formula bug class + shows up as cell-boundary wobble that's hard to spot in screenshots + but obvious in motion. Walk a sloped landblock during verification. +- **Don't extend the phase scope.** N.5b is "terrain on modern path." + Sky, particles, EnvCells — all subsequent phases. If the brainstorm + tries to expand, push back. + +--- + +## Reference: the N.5 dispatcher flow you're mirroring + +``` +WbDrawDispatcher.Draw(...) { + // Phase 1: walk entities, build groups + // Phase 2: lay matrices contiguously + // Phase 3: build BatchData + DEIC arrays via BuildIndirectArrays + // Phase 4: upload 3 SSBOs (instances, batches, indirect) + // Phase 5: bind global VAO + SSBOs + // Phase 6: opaque pass — glMultiDrawElementsIndirect + // Phase 7: transparent pass — glMultiDrawElementsIndirect +} +``` + +For terrain the shape is similar but simpler: + +``` +TerrainModernDispatcher.Draw(...) { + // Phase 1: walk visible landblocks, frustum cull + // Phase 2: build per-landblock IndirectGroupInput list + // (one entry per visible landblock — typically 25-121) + // Phase 3: upload 2 SSBOs (terrain batch data, indirect commands) + // (no per-instance buffer needed — terrain isn't instanced) + // Phase 4: bind global terrain VAO + SSBOs + // Phase 5: opaque pass ONLY — glMultiDrawElementsIndirect +} +``` + +Total ~6-8 GL calls per frame for terrain. That's the destination. + +Good luck. 
The split-formula decision is the only really hard call; +everything else is mechanical port work on top of N.5's substrate. +Holler at the user if anything in #51's three paths feels genuinely +ambiguous after reading the references.
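
## Appendix: sketch of the formula-divergence count

The divergence quantification (step 8 of the first 30 minutes) could be shaped roughly as below. This is a minimal, language-neutral sketch written in Python for brevity; the real one-shot test would presumably live in the C# test project. Everything here is an assumption for illustration: the 8x8 cells-per-landblock grid, the names `disagreement_rate` and `classify`, and above all the two `placeholder_*` predicates, which are NOT retail's `FSplitNESW` or WB's `CalculateSplitDirection`. Swap in ports of the real formulas before trusting any number.

```python
from typing import Callable, Iterable, Tuple

# Assumption: a landblock is an 8x8 grid of cells.
CELLS_PER_SIDE = 8

# (lbX, lbY, cellX, cellY) -> True when the cell is split SW-to-NE.
SplitFn = Callable[[int, int, int, int], bool]


def disagreement_rate(formula_a: SplitFn, formula_b: SplitFn,
                      landblocks: Iterable[Tuple[int, int]]) -> float:
    """Fraction of cells on which the two split predicates differ."""
    total = 0
    differing = 0
    for lb_x, lb_y in landblocks:
        for cell_x in range(CELLS_PER_SIDE):
            for cell_y in range(CELLS_PER_SIDE):
                total += 1
                a = formula_a(lb_x, lb_y, cell_x, cell_y)
                b = formula_b(lb_x, lb_y, cell_x, cell_y)
                if a != b:
                    differing += 1
    return differing / total if total else 0.0


def classify(rate: float) -> str:
    """Apply the <5% / >20% thresholds quoted in this handoff."""
    if rate < 0.05:
        return "Path A risk is bounded"
    if rate > 0.20:
        return "Path B becomes more attractive"
    return "inconclusive"


# Hypothetical stand-ins so the sketch runs; NOT the real formulas.
def placeholder_retail(lb_x: int, lb_y: int, cx: int, cy: int) -> bool:
    gx, gy = lb_x * CELLS_PER_SIDE + cx, lb_y * CELLS_PER_SIDE + cy
    return (gx * gy) % 2 == 0


def placeholder_wb(lb_x: int, lb_y: int, cx: int, cy: int) -> bool:
    gx, gy = lb_x * CELLS_PER_SIDE + cx, lb_y * CELLS_PER_SIDE + cy
    return (gx + gy) % 2 == 0


if __name__ == "__main__":
    # Placeholder landblock coordinates; use the representative set
    # (Holtburg, Foundry, open landscape, sloped terrain) in practice.
    sample = [(0, 0), (1, 2)]
    rate = disagreement_rate(placeholder_retail, placeholder_wb, sample)
    print(f"disagreement rate: {rate:.1%} -> {classify(rate)}")
```

Keeping the two formulas behind a common predicate signature means the same loop can later be reused to compare the visual mesh path against the physics path, whichever Path the brainstorm picks.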