spec(N.5b): design for terrain on the modern rendering path
Brainstormed 2026-05-09. Lifts outdoor terrain rendering onto N.5's modern primitives (bindless textures + glMultiDrawElementsIndirect), preserving the visible terrain pixel-for-pixel and preserving physics-vs-visual Z agreement (issue #51).

Key decisions:
- Path C: WB renderer pattern + acdream's existing LandblockMesh.Build (which uses retail's FSplitNESW formula, verified at retail addr 00531d10). Path A killed by 49.98% measured divergence vs retail.
- Single global VBO/EBO + slot allocator (one slot per landblock), uint32 indices with baseVertex baked, mirror WB's pattern.
- Keep TerrainAtlas (palCode-based fragment blending), add bindless handles. No LandSurfaceManager adoption.
- Separate terrain_modern.vert/.frag (port of today's terrain.vert/.frag with bindless preamble; same blend math, same AdjustPlanes lighting).
- Pure-CPU Z-conformance sentinel: meshTriZ vs TerrainSurface within 1mm across 10 representative landblocks x 100 sample points.
- Acceptance: build green, conformance test passes, ~6-8 GL calls/frame for terrain regardless of scene size, [TERRAIN-DIAG] cpu_ms at radius=5 >=10% lower than today's per-LB-binds path.

Files added: TerrainModernRenderer + TerrainSlotAllocator + terrain_modern.vert/.frag + 2 test files.
Files deleted: TerrainChunkRenderer + TerrainRenderer + terrain.vert/.frag.
Out of scope: EnvCells/dungeons, sky, particles, A.5 LOD, LandSurfaceManager adoption, fork-patching WB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Parent: 47f2cea1e8
Commit: b35ddf3426
1 changed file with 438 additions and 0 deletions

@@ -0,0 +1,438 @@
# Phase N.5b — Terrain on the Modern Rendering Path — Design Spec

**Status:** Brainstormed 2026-05-09; not yet implemented.
**Author:** acdream lead engineer + Claude.
**Builds on:** Phase N.5 (`WbDrawDispatcher` on bindless + multi-draw indirect, shipped 2026-05-08).

**Predecessor docs (read first if you're new to this phase):**
- [`docs/research/2026-05-09-phase-n5b-handoff.md`](../../research/2026-05-09-phase-n5b-handoff.md) — cold-start briefing.
- [`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`](../plans/2026-05-08-phase-n5-modern-rendering.md) — N.5 plan + ship record.
- [`docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`](2026-05-08-phase-n5-modern-rendering-design.md) — N.5 spec; the substrate N.5b consumes.
- [`docs/ISSUES.md`](../../ISSUES.md) issue #51 — the load-bearing constraint this phase resolves.

---

## 1. Problem statement

N.5 lifted **entity** rendering onto bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher is 1.23 ms/frame median at Holtburg courtyard; ~810 fps sustained; ~12-15 GL calls/frame for entities regardless of scene complexity. Terrain is still on the older per-landblock pipeline (`TerrainChunkRenderer` at [src/AcDream.App/Rendering/TerrainChunkRenderer.cs](../../../src/AcDream.App/Rendering/TerrainChunkRenderer.cs)): bind a per-chunk VAO + IBO, issue `glDrawElements` per visible chunk. The call count tracks the (2r+1)² landblock window: ~25 GL calls/frame for terrain at radius=2, scaling to ~121 at radius=5.

**N.5b's goal:** lift terrain rendering onto the same modern primitives N.5 just delivered, preserving the visible terrain pixel-for-pixel and preserving physics-vs-visual Z agreement (issue #51 / the cell-boundary wobble bug class).

The work is straightforward in shape — N.5's substrate (bindless wrapper, `DrawElementsIndirectCommand` struct, `[WB-DIAG]` instrumentation, two-phase Dispose pattern) is already built. The non-trivial decision is how to handle the formula divergence between WorldBuilder and retail.

---

## 2. The formula divergence (why Path A is dead)

WorldBuilder's `TerrainUtils.CalculateSplitDirection` ([references/WorldBuilder/.../TerrainUtils.cs:44-53](../../../references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44)) and acdream's `TerrainBlending.CalculateSplitDirection` ([src/AcDream.Core/Terrain/TerrainBlending.cs:56](../../../src/AcDream.Core/Terrain/TerrainBlending.cs:56)) use mathematically distinct formulas:

| Implementation | Formula | Source |
|---|---|---|
| acdream | `dw = x*y*0x0CCAC033 - x*0x421BE3BD + y*0x6C1AC587 - 0x519B8F25; bit31` | AC2D `Landblocks.cpp:346-350` |
| WB | `(seedA + 1813693831) - seedB - 1369149221 >= 0.5` (rescaled) where `seedA = (lbX*8+cellX)*214614067; seedB = (lbY*8+cellY)*1109124029` | clean-room reverse engineering |

**Verified retail authority:** the named retail decomp at [`docs/research/named-retail/acclient_2013_pseudo_c.txt`](../../research/named-retail/acclient_2013_pseudo_c.txt) lines 316042-316144 (function `CLandBlockStruct::ConstructPolygons` at retail address `00531d10`) contains the constants `0x0CCAC033 / 0x6C1AC587 / 0x421BE3BD / 0x519B8F25` verbatim. **Retail uses AC2D's formula.** acdream matches retail. **WB does not.**
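
As a minimal sketch of the retail/AC2D expression in the table above (not acdream's actual `TerrainBlending.CalculateSplitDirection`), the hash can be evaluated with wrapping 32-bit arithmetic. The class and method names are hypothetical, the inputs are assumed to be the cell coordinates the formula calls `x` and `y`, and the mapping from bit 31 to a specific diagonal is left to the caller because the table does not state it.

```csharp
using System;

// Hypothetical helper: evaluates the retail split hash from the table above.
public static class SplitHashSketch
{
    // Returns true when bit 31 of dw is set. x and y are assumed to be the
    // cell coordinates the formula refers to; arithmetic wraps modulo 2^32.
    public static bool Bit31(uint x, uint y)
    {
        unchecked
        {
            uint dw = x * y * 0x0CCAC033u
                    - x * 0x421BE3BDu
                    + y * 0x6C1AC587u
                    - 0x519B8F25u;
            return (dw & 0x80000000u) != 0;
        }
    }
}
```

The `unchecked` block matters: the formula relies on 32-bit wraparound, so a checked build context would otherwise throw on overflow.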

**Quantified divergence** (per `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`, sweep across 255×255 landblocks × 64 cells = 4,161,600 cells):

| Comparison | Disagreement rate |
|---|---|
| Raw enum output (WB enum vs acdream enum) | **50.02%** |
| Diagonal-actually-painted (post-correcting for WB's inverted enum semantics) | **49.98%** |
| Holtburg town (0xA9B0) | 29/64 cells (45.3%) wrong if using WB |
| Worst landblock (0x4D96) | 47/64 cells (73.4%) wrong if using WB |
| Best landblock (0x0478) | 17/64 cells (26.6%) wrong if using WB |

The two formulas behave like independent random hashes. Adopting WB's pipeline wholesale (Path A) would visibly mis-render ~half the diagonals on every landblock — the cell-boundary wobble bug class would be present everywhere.

**Path A is dead.** N.5b commits to Path C (see Decision 1 below): use WB's *renderer* pattern (single global VBO/EBO + slot allocator + multi-draw indirect), driven by acdream's existing `LandblockMesh.Build` which uses retail's formula.

---

## 3. Decisions log

The eight brainstorm outcomes, locked.

| # | Decision | Choice | Reason |
|---|---|---|---|
| 1 | Formula source for cell split direction | **Path C — WB renderer pattern, acdream's `LandblockMesh.Build` + `TerrainBlending.CalculateSplitDirection`** (retail's formula) | Path A measured 49.98% diagonal-painted divergence vs retail. Path B (fork-patch WB) is permanent maintenance burden. Path C keeps a known-working asset and avoids fork friction. Same per-frame perf as either alternative. |
| 2 | Atlas model | **Keep `TerrainAtlas` (palCode-based fragment blending) + add bindless handles** | Visual correctness already locked in. Bindless wrapper is ~50 lines, cookie-cutter from N.5's `TextureCache.MakeResidentHandle` pattern. No perf win from adopting WB's `LandSurfaceManager`. |
| 3 | Mesh ownership | **Single global VBO/EBO + slot allocator, one slot per landblock** | Required for `glMultiDrawElementsIndirect` to actually win — per-LB IBOs would force per-LB binds, defeating the point. Mirrors N.5's pattern + WB's pattern. |
| 4 | Index format | **uint32 + baseVertex baked into indices on upload** | Matches WB's pattern verbatim ("maximum driver compatibility"). 192 KB extra IBO at 256 slots — rounding error vs vertex bytes. Future-proofs A.5's higher radius. |
| 5 | Shader unification | **Separate `terrain_modern.vert/.frag`** | Vertex layouts are meaningfully different (terrain: 6 attribs incl. palCode; entities: position+UV+normal+per-instance matrix). Unifying forces dead code on both sides; no perf win. |
| 6 | Streaming integration | **Mirror WB's slot allocator (free-list `Queue<int>` + power-of-two grow). Skip WB's 15s unload delay.** | Free-list standard; grow-by-doubling matches N.5 buffer growth pattern. The 15s delay would compete with `StreamingLoader`'s existing hysteresis — let one component own lifecycle policy. |
| 7 | Conformance test | **Pure-CPU sweep: visual mesh Z = `TerrainSurface.SampleZFromHeightmap` within 1mm, 10 representative landblocks × 100 sample points** | The exact issue #51 sentinel. ~1,000 assertions/run, <100ms, no GL infrastructure needed. Catches any silent formula or vertex-layout drift. |
| 8 | Visual verification gate | **4 outdoor scenes (Holtburg flat + sloped, Foundry-area, sloped LB) × 6 visual checks** | Outdoor-only — interiors / dungeons / EnvCells are out of scope and not testable yet. The wobble check is the load-bearing #51 sentinel. |

---

## 4. Architecture overview

### Per-frame draw flow

```
TerrainModernRenderer.Draw(camera, frustum, neverCullId):
  1. Walk all loaded slots → per-slot frustum cull (AABB test).
     Build _visibleSlots list (in-place reuse, no per-frame alloc).

  2. If _visibleSlots.Count == 0: early-out.

  3. Build per-frame DEIC array, one entry per visible slot:
       DrawElementsIndirectCommand {
         Count        = 384,              // indices per landblock (64 cells × 2 tris × 3)
         InstanceCount= 1,
         FirstIndex   = slot.FirstIndex,  // baked offset into global IBO
         BaseVertex   = 0,                // already baked into indices
         BaseInstance = 0
       }

  4. Bind DRAW_INDIRECT_BUFFER (_indirectBuffer); glBufferSubData on that target needs it bound.
     If _drawIndirectCapacity < _visibleSlots.Count:
       delete + re-allocate _indirectBuffer (power-of-two grow), then re-bind.
     glBufferSubData(DRAW_INDIRECT_BUFFER, 0, sizeof(DEIC) * _visibleSlots.Count, deicArray)

  5. shader.Use()                        // terrain_modern
  6. Bind global VAO (_globalVao)
  7. Set bindless handle uniforms: glProgramUniformHandleui64ARB for uTerrain + uAlpha
  8. DRAW_INDIRECT_BUFFER is still bound from step 4 (re-bind only if other code changed it).
  9. glMemoryBarrier(GL_COMMAND_BARRIER_BIT)
  10. glMultiDrawElementsIndirect(Triangles, UnsignedInt, indirect=0,
                                  drawcount=_visibleSlots.Count, stride=sizeof(DEIC))
  11. Unbind VAO.

GL calls per frame for terrain: ~6-8 fixed (the itemized list below sums to 9).
  - 1× shader.Use
  - 1× BindVertexArray
  - 2× ProgramUniformHandleui64ARB (atlas handles)
  - 1× BindBuffer for DRAW_INDIRECT_BUFFER
  - 1× BufferSubData for DEIC array
  - 1× MemoryBarrier
  - 1× MultiDrawElementsIndirect
  - 1× BindVertexArray(0)
```
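
Step 3 of the flow above can be sketched on the CPU without any GL calls. This is a minimal illustration, not the shipped code: the struct mirrors the 20-byte layout OpenGL defines for indirect indexed draws, while `DeicBuilderSketch` and its members are hypothetical names, and the field types are an assumption about how N.5's existing struct is declared.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Mirrors the 20-byte command layout consumed by glMultiDrawElementsIndirect.
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;         // number of indices to draw
    public uint InstanceCount; // 1 = non-instanced
    public uint FirstIndex;    // offset into the index buffer, in indices
    public int  BaseVertex;    // added to every index (0 here: already baked)
    public uint BaseInstance;
}

// Hypothetical builder for step 3; reuses the caller's array across frames.
public static class DeicBuilderSketch
{
    public const uint IndicesPerLandblock = 384; // 64 cells × 2 triangles × 3

    // Fills `commands` (growing it by doubling when too small) and returns the draw count.
    public static int Build(IReadOnlyList<int> visibleSlots,
                            ref DrawElementsIndirectCommand[] commands)
    {
        if (commands.Length < visibleSlots.Count)
        {
            int newLength = Math.Max(1, commands.Length);
            while (newLength < visibleSlots.Count) newLength *= 2;
            commands = new DrawElementsIndirectCommand[newLength];
        }

        for (int i = 0; i < visibleSlots.Count; i++)
        {
            commands[i] = new DrawElementsIndirectCommand
            {
                Count = IndicesPerLandblock,
                InstanceCount = 1,
                FirstIndex = (uint)visibleSlots[i] * IndicesPerLandblock,
                BaseVertex = 0,
                BaseInstance = 0,
            };
        }
        return visibleSlots.Count;
    }
}
```

Only the first `visibleSlots.Count` entries are meaningful after a call; the returned count is what would be passed as `drawcount`.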

### Per-landblock-load flow (streaming integration)

```
TerrainModernRenderer.AddLandblock(id, meshData, worldOrigin):
  1. If id already present: RemoveLandblock(id) first (replaces).
  2. Bake worldOrigin into vertex positions (CPU; ~12µs per landblock).
  3. Acquire slot:
       if _freeSlots.TryDequeue: reuse
       else: slot = _nextFreeSlot++; if needed, EnsureCapacity(_nextFreeSlot).
  4. Compute slot offsets:
       slotByteOffset_VBO = slot * 384 * 40 bytes   (15,360 bytes per slot)
       slotByteOffset_IBO = slot * 384 * 4 bytes    (1,536 bytes per slot)
       firstIndex         = slot * 384
       baseVertex         = slot * 384
  5. Bake baseVertex into indices on CPU (indices[i] += baseVertex).
  6. glBufferSubData(VBO, slotByteOffset_VBO, vertBytes, vertData).
  7. glBufferSubData(IBO, slotByteOffset_IBO, idxBytes, bakedIndices).
  8. Compute slot AABB: min = (worldOrigin.x, worldOrigin.y, minZ),
                        max = (worldOrigin.x + 192, worldOrigin.y + 192, maxZ).
  9. Store SlotData {id, worldOrigin, firstIndex, indexCount, aabbMin, aabbMax}.
  10. _idToSlot[id] = slot.

TerrainModernRenderer.RemoveLandblock(id):
  1. _idToSlot.TryGetValue(id) → slot.
  2. _freeSlots.Enqueue(slot); _idToSlot.Remove(id); _slots[slot] = null.
     (No GPU clear — DEIC list won't reference unused slots.)

EnsureCapacity(requiredSlots):
  newCap = max(initialCapacity, currentCap * 2)
  while newCap < requiredSlots: newCap *= 2.
  Allocate new VBO + IBO at new size.
  glCopyBufferSubData old → new (preserve loaded slot data).
  Delete old; recreate VAO pointing at new VBO+IBO.
```
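
The offset arithmetic in steps 4 and 5 of `AddLandblock` is easy to get subtly wrong, so a small sketch may help. Everything here is illustrative: `SlotLayoutSketch` and its members are hypothetical names, and the constants are taken from the per-slot sizes quoted in the flow above.

```csharp
using System;

// Hypothetical constants and helpers for steps 4-5 of AddLandblock.
public static class SlotLayoutSketch
{
    public const int VerticesPerSlot = 384;   // per the 384 × 40-byte slot size above
    public const int IndicesPerSlot = 384;
    public const int VertexStrideBytes = 40;  // sizeof(TerrainVertex)
    public const int IndexStrideBytes = 4;    // uint32 indices

    public static long VboByteOffset(int slot) =>
        (long)slot * VerticesPerSlot * VertexStrideBytes;

    public static long IboByteOffset(int slot) =>
        (long)slot * IndicesPerSlot * IndexStrideBytes;

    public static uint FirstIndex(int slot) => (uint)(slot * IndicesPerSlot);

    // Returns a copy of the landblock-local indices shifted by the slot's base
    // vertex, so the draw command can leave BaseVertex at 0.
    public static uint[] BakeBaseVertex(ReadOnlySpan<uint> localIndices, int slot)
    {
        uint baseVertex = (uint)(slot * VerticesPerSlot);
        var baked = new uint[localIndices.Length];
        for (int i = 0; i < localIndices.Length; i++)
            baked[i] = localIndices[i] + baseVertex;
        return baked;
    }
}
```

With these numbers, a 256-slot uint32 index buffer is 256 × 1,536 = 393,216 bytes, which is 196,608 bytes (192 KB) more than uint16 indices would need and agrees with the figure in Decision 4.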

### Relation to N.5's existing dispatcher

`TerrainModernRenderer` is structurally **parallel** to `WbDrawDispatcher`, not nested under it. They share:

- `BindlessSupport` wrapper for `ARB_bindless_texture` calls
- `DrawElementsIndirectCommand` struct (20-byte layout)
- `[WB-DIAG]` instrumentation pattern (CPU `Stopwatch` + GPU `GL_TIME_ELAPSED` queries)
- `SceneLighting` UBO at binding=1

But they're separate dispatchers with separate global buffers, separate VAOs, separate shaders. Per frame, `GameWindow.Draw` calls them in sequence:

1. `_wbDrawDispatcher.Draw(...)` — entities (opaque + transparent passes)
2. `_terrain.Draw(...)` — terrain (single opaque pass; `_terrain` is the existing `GameWindow` field, retyped to `TerrainModernRenderer?`)
3. Sky / particles / debug / UI on legacy paths until later phases retire them.

---

## 5. Component changes

### Files added

| File | Purpose | Approx. size |
|---|---|---|
| `src/AcDream.App/Rendering/TerrainModernRenderer.cs` | The new dispatcher. Owns global VBO/EBO + slot allocator + per-frame DEIC build + `glMultiDrawElementsIndirect` dispatch. | ~400-500 lines |
| `src/AcDream.App/Rendering/TerrainSlotAllocator.cs` | Pure-CPU helper extracted for unit testing: free-list slot management + DEIC array builder. | ~150 lines |
| `src/AcDream.App/Rendering/Shaders/terrain_modern.vert` | Vertex shader. Same per-cell layout as today's `terrain.vert` (locations 0-5). Reads bindless atlas handles via uniform. Same `SceneLighting` UBO at binding=1. Same per-vertex AdjustPlanes lighting bake. | ~150 lines |
| `src/AcDream.App/Rendering/Shaders/terrain_modern.frag` | Fragment shader. Same `combineOverlays` + `combineRoad` + `maskBlend3` as today's `terrain.frag`. Samples bindless `sampler2DArray` handles via `GL_ARB_bindless_texture` extension. Same fog + lightning flash + atmosphere. | ~150 lines |
| `tests/AcDream.Core.Tests/Terrain/TerrainModernConformanceTests.cs` | The Z-conformance sentinel for issue #51's bug class. ~10 representative landblocks × ~100 sample points; asserts `\|meshTriZ - TerrainSurface.SampleZFromHeightmap\| < 0.001m`. | ~150 lines |
| `tests/AcDream.Core.Tests/Rendering/TerrainSlotAllocatorTests.cs` | Unit tests for the slot allocator (free-list correctness, capacity grow, AABB tracking) + DEIC build correctness. Pure CPU; no GL. | ~200 lines |

### Files modified

| File | Change |
|---|---|
| `src/AcDream.App/Rendering/TerrainAtlas.cs` | Add `GetBindlessHandles()` returning `(ulong terrain, ulong alpha)`. Mirrors N.5's `TextureCache.MakeResidentHandle` pattern: generate handle once at first call, make resident, cache. The existing `GlTexture` / `GlAlphaTexture` `uint` properties stay (no legacy callers to migrate yet, but the path is preserved). |
| `src/AcDream.App/Rendering/GameWindow.cs` | Field declaration ([line 21](../../../src/AcDream.App/Rendering/GameWindow.cs:21)): `_terrain` field type `TerrainChunkRenderer? → TerrainModernRenderer?`. Construction ([line 1391](../../../src/AcDream.App/Rendering/GameWindow.cs:1391)): `new TerrainChunkRenderer(gl, shader, atlas)` → `new TerrainModernRenderer(gl, bindless, shader, atlas)`. Wire the `[TERRAIN-DIAG]` rollup callback (mirror the existing `[WB-DIAG]` callback wiring). |
| `docs/plans/2026-04-11-roadmap.md` | N.5b → "Shipped" row on completion; N.6 entry refreshed to remove "terrain on modern path" from scope. |
| `docs/ISSUES.md` | Issue #51 → "Recently closed" with the SHIP commit SHA. |
| `CLAUDE.md` "WB integration cribs" section | Add the N.5b crib: terrain dispatcher mirror of WB's pattern, retail-formula preserved via `LandblockMesh.Build` + `TerrainBlending.CalculateSplitDirection`. |
| `memory/project_phase_n5b_state.md` (new memory file) | Captures any high-value gotchas discovered during N.5b implementation (analogous to `project_phase_n5_state.md`'s three gotchas). |

### Files deleted

| File | Reason |
|---|---|
| `src/AcDream.App/Rendering/TerrainChunkRenderer.cs` (454 lines) | Replaced by `TerrainModernRenderer`. |
| `src/AcDream.App/Rendering/TerrainRenderer.cs` (247 lines) | Older sibling — already not wired in production. Has no users. Goes away in the same commit as `TerrainChunkRenderer`. |
| `src/AcDream.App/Rendering/Shaders/terrain.vert` (147 lines) | Replaced by `terrain_modern.vert`. |
| `src/AcDream.App/Rendering/Shaders/terrain.frag` (149 lines) | Replaced by `terrain_modern.frag`. |

### Net diff

- Adds: ~6 files, ~1,200 lines (renderer + slot-allocator + 2 shaders + 2 test files)
- Removes: ~4 files, ~1,000 lines (2 old renderers + 2 old shaders)
- Net: ~+200 lines for the same visual output, with the dispatcher collapsed to ~6-8 GL calls/frame regardless of scene size

### Public API of `TerrainModernRenderer`

```csharp
public sealed class TerrainModernRenderer : IDisposable
{
    public TerrainModernRenderer(
        GL gl,
        BindlessSupport bindless,
        Shader terrainModernShader,
        TerrainAtlas atlas,
        int initialSlotCapacity = 64);

    public void AddLandblock(uint landblockId, LandblockMeshData mesh, Vector3 worldOrigin);
    public void RemoveLandblock(uint landblockId);
    public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null);

    public int LoadedSlots { get; }    // for [TERRAIN-DIAG]
    public int VisibleSlots { get; }   // for [TERRAIN-DIAG]
    public int CapacitySlots { get; }  // for [TERRAIN-DIAG]

    public void Dispose();
}
```

Same external interface as today's `TerrainChunkRenderer` (`AddLandblock` + `RemoveLandblock` + `Draw`). Drop-in at `GameWindow.cs:1391`.

---

## 6. Vertex format & shader

### Vertex format: `TerrainVertex` stays as-is (40 bytes)

```csharp
[StructLayout(LayoutKind.Sequential)]
public readonly record struct TerrainVertex(
    Vector3 Position,  // 12 bytes — world-space (worldOrigin baked in by AddLandblock)
    Vector3 Normal,    // 12 bytes — per-vertex from central-difference (Phase 3b)
    uint Data0,        //  4 bytes — base+ovl0 tex/alpha indices
    uint Data1,        //  4 bytes — ovl1+ovl2 tex/alpha indices
    uint Data2,        //  4 bytes — road0+road1 tex/alpha indices
    uint Data3);       //  4 bytes — rotations + splitDir bit
// total: 40 bytes
```

Already correct, already debugged. Per-vertex normal is preserved because retail bakes AdjustPlanes lighting at the vertex stage — losing it would re-introduce the "warmer / less blue than retail" regression researched in [`docs/research/2026-04-24-lambert-brightness-split.md`](../../research/2026-04-24-lambert-brightness-split.md).

VAO attribute layout (locations 0-5, unchanged from today's `terrain.vert`):

| Loc | Type | Source | Purpose |
|---|---|---|---|
| 0 | vec3 (3 floats) | Position offset 0 | world-space position |
| 1 | vec3 (3 floats) | Normal offset 12 | per-vertex normal |
| 2 | uvec4 (4 bytes) | Data0 offset 24 | base+ovl0 tex/alpha |
| 3 | uvec4 (4 bytes) | Data1 offset 28 | ovl1+ovl2 tex/alpha |
| 4 | uvec4 (4 bytes) | Data2 offset 32 | road0+road1 tex/alpha |
| 5 | uvec4 (4 bytes) | Data3 offset 36 | rotations + splitDir |
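
The offsets in the table can be checked mechanically. The sketch below is an assumption-laden illustration rather than project code: it uses a hypothetical plain-field mirror of `TerrainVertex` (named `TerrainVertexLayoutSketch`) because `Marshal.OffsetOf` looks fields up by name, and it assumes the real struct has the same sequential field order.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Hypothetical field-based mirror of TerrainVertex, used only to check offsets.
[StructLayout(LayoutKind.Sequential)]
public struct TerrainVertexLayoutSketch
{
    public Vector3 Position; // 12 bytes
    public Vector3 Normal;   // 12 bytes
    public uint Data0;
    public uint Data1;
    public uint Data2;
    public uint Data3;
}

public static class TerrainVertexLayoutCheck
{
    // Byte offset of a named field, as a VAO attribute pointer would need it.
    public static int OffsetOf(string field) =>
        (int)Marshal.OffsetOf<TerrainVertexLayoutSketch>(field);

    // Total stride of one vertex in bytes.
    public static int Stride => Marshal.SizeOf<TerrainVertexLayoutSketch>();
}
```

A unit test along these lines would catch an accidental field reorder before it shows up as scrambled terrain textures.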

### Shader: `terrain_modern.vert/.frag`

The structural change vs today's `terrain.vert/.frag` is small. The blend math, lighting bake, fog, lightning flash all stay verbatim. The only change is how textures are bound:

```glsl
// terrain_modern.frag — preamble
#version 460 core
#extension GL_ARB_bindless_texture : require

uniform sampler2DArray uTerrain;  // 64-bit bindless handle, set per-frame
uniform sampler2DArray uAlpha;    // 64-bit bindless handle, set per-frame

// SceneLighting UBO at binding=1 (unchanged from today)
layout(std140, binding = 1) uniform SceneLighting { ... };

// rest is unchanged from today's terrain.frag — combineOverlays, combineRoad,
// maskBlend3, applyFog, lightning flash are line-for-line identical
```

C# side per frame:

```csharp
// once at startup or first Draw, after atlas is built:
var (terrainHandle, alphaHandle) = atlas.GetBindlessHandles();
// MakeTextureHandleResidentARB called inside GetBindlessHandles, mirror N.5's pattern

// per frame:
shader.Use();
// (the underlying GL entry point for these two calls is glProgramUniformHandleui64ARB)
gl.ProgramUniformHandleARB(shader.Program, uTerrainLoc, terrainHandle);
gl.ProgramUniformHandleARB(shader.Program, uAlphaLoc, alphaHandle);
// ... bind global VAO + DEIC + glMultiDrawElementsIndirect
```

The bindless extension makes texture access syntactically identical to today's `sampler2DArray` uniform; the only difference is *how* the sampler is set on the C# side. Apart from the `#extension` line, the GLSL body is unaware that the samplers are bindless.
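
The "generate once, make resident, cache" behavior described for `GetBindlessHandles()` can be sketched without a GL context. This is a hypothetical stand-in, not `TerrainAtlas` itself: the class name is invented, and the two delegates are assumptions that stand in for `glGetTextureHandleARB` and `glMakeTextureHandleResidentARB` so the caching logic can be exercised in a pure-CPU test.

```csharp
using System;

// Hypothetical cache: creates each bindless handle once, makes it resident,
// then returns the cached pair on every later call.
public sealed class BindlessHandlePairSketch
{
    private readonly Func<uint, ulong> _getHandle;   // stands in for glGetTextureHandleARB
    private readonly Action<ulong> _makeResident;    // stands in for glMakeTextureHandleResidentARB
    private readonly uint _terrainTexture;
    private readonly uint _alphaTexture;
    private (ulong Terrain, ulong Alpha)? _cached;

    public BindlessHandlePairSketch(uint terrainTexture, uint alphaTexture,
        Func<uint, ulong> getHandle, Action<ulong> makeResident)
    {
        _terrainTexture = terrainTexture;
        _alphaTexture = alphaTexture;
        _getHandle = getHandle;
        _makeResident = makeResident;
    }

    public (ulong Terrain, ulong Alpha) GetBindlessHandles()
    {
        if (_cached.HasValue) return _cached.Value;

        ulong terrain = _getHandle(_terrainTexture);
        ulong alpha = _getHandle(_alphaTexture);
        _makeResident(terrain);
        _makeResident(alpha);

        _cached = (terrain, alpha);
        return _cached.Value;
    }
}
```

Caching matters because a handle should be made resident once and then reused; the per-frame cost is reduced to uploading the two 64-bit values.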

### SSBO/UBO binding map (cross-checked with N.5)

| Binding | Type | Owner | Used by |
|---|---|---|---|
| SSBO=0 | `Instances[]` (mat4) | `WbDrawDispatcher` | `mesh_modern.vert` |
| SSBO=1 | `Batches[]` (handle+layer+flags) | `WbDrawDispatcher` | `mesh_modern.vert/.frag` |
| **SSBO=2** | (reserved) | — | future per-batch terrain data when A.5 wants per-LB atlas variation |
| UBO=1 | `SceneLighting` | `GameWindow` (set once/frame) | `mesh_modern.frag`, `terrain_modern.vert/.frag`, `sky.frag`, etc. |

N.5b doesn't introduce a new SSBO. The atlas handles are uniforms, not SSBO entries — atlas is region-wide so per-frame upload is two `uvec2`s (16 bytes), not worth the SSBO machinery. SSBO=2 stays available for future per-batch terrain data.

### What's preserved bit-for-bit from today's shaders

- `unpackOverlayLayer(...)` (rotation logic for overlays)
- The `gl_VertexID % 6 → corner` table for both SWtoNE and SEtoNW splits (the geometry mapping that was debugged 2026-04-21 to match ACE's `ConstructPolygons`)
- `MIN_FACTOR = 0.0` for the AdjustPlanes Lambert floor (the brightness research)
- `combineOverlays` + `combineRoad` + `maskBlend3` fragment math
- `applyFog` distance-blend
- Lightning flash additive overlay
- Per-vertex sun + ambient bake into `vLightingRGB`

---

## 7. Conformance + verification

### CPU unit tests (no GL required)

**`tests/AcDream.Core.Tests/Rendering/TerrainSlotAllocatorTests.cs`** — exercises the dispatcher's pure-CPU pieces in isolation:

| Test | Asserts |
|---|---|
| `Add_FirstLandblock_GetsSlotZero` | `_nextFreeSlot` starts at 0; first add uses slot 0 |
| `Add_SecondLandblock_GetsSlotOne` | Sequential adds use sequential slots |
| `RemoveThenAdd_ReusesFreedSlot` | Free-list FIFO: remove slot 0, add new LB → slot 0 again |
| `Add_BeyondInitialCapacity_DoublesCapacity` | After 64 adds, 65th triggers grow to 128 |
| `AddSameId_ReplacesExistingSlot` | Re-adding an LB id replaces in same slot (no leak); with a FIFO free list this holds when no other freed slot is already queued |
| `Build_DeicArray_VisibleSlotsOnly` | DEIC array has one entry per visible slot, `firstIndex = slot * 384`, `count = 384` |
| `Build_DeicArray_EmptyVisible` | No visible → empty array |
| `Aabb_StoredFromWorldOrigin` | Slot's AABB is `(origin.x, origin.y, minZ)..(origin.x+192, origin.y+192, maxZ)` |
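
The first five rows of the table describe behavior that a small pure-CPU class can satisfy. The following is a minimal sketch under stated assumptions, not the planned `TerrainSlotAllocator`: the class name and members are hypothetical, and capacity growth is reduced to a counter where the real renderer would reallocate its GPU buffers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pure-CPU allocator: FIFO free list plus doubling capacity.
public sealed class SlotAllocatorSketch
{
    private readonly Queue<int> _freeSlots = new();
    private readonly Dictionary<uint, int> _idToSlot = new();
    private int _nextFreeSlot;

    public int Capacity { get; private set; }
    public int LoadedSlots => _idToSlot.Count;

    public SlotAllocatorSketch(int initialCapacity = 64) =>
        Capacity = Math.Max(1, initialCapacity);

    // Returns the slot assigned to `id`; re-adding an id frees its old slot first.
    public int Add(uint id)
    {
        Remove(id);
        if (!_freeSlots.TryDequeue(out int slot))
        {
            slot = _nextFreeSlot++;
            // A real renderer would reallocate and copy its VBO/IBO here.
            while (Capacity < _nextFreeSlot) Capacity *= 2;
        }
        _idToSlot[id] = slot;
        return slot;
    }

    // Returns the id's slot to the free list; false if the id was not loaded.
    public bool Remove(uint id)
    {
        if (!_idToSlot.Remove(id, out int slot)) return false;
        _freeSlots.Enqueue(slot);
        return true;
    }
}
```

Because the free list is a queue, a re-added id gets its old slot back only when no earlier freed slot is waiting ahead of it, which is worth stating explicitly in the `AddSameId_ReplacesExistingSlot` test setup.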

**`tests/AcDream.Core.Tests/Terrain/TerrainModernConformanceTests.cs`** — the Z-conformance sentinel for issue #51's bug class.

Pattern modeled on the existing `ClientConformanceTests.cs`. For each landblock:

1. Load real dat heightmap data (10 representative landblocks: Holtburg flat 0xA9B0, Holtburg sloped 0xA9B1, Foundry 0x8080, Cragstone 0xCB99, Direlands sample 0xC040, plus 5 randomly-chosen sloped landblocks from a fixed seed for variety).
2. Build mesh via `LandblockMesh.Build(...)` (the source-of-truth generator that `TerrainModernRenderer` calls internally).
3. For 100 (localX, localY) sample points uniformly distributed in `[0, 192] × [0, 192]`:
   - Compute `meshTriZ`: find the triangle in the built mesh containing the point, barycentric-interpolate Z from its three vertex Zs.
   - Compute `physicsZ = TerrainSurface.SampleZFromHeightmap(heights, heightTable, lbX, lbY, localX, localY)`.
   - Assert `|meshTriZ - physicsZ| < 0.001m` (1 mm tolerance — well below visible threshold).
4. Total: 10 landblocks × 100 points = 1,000 assertions per run; runs in <100 ms.

If this test fires, the pipeline has silently drifted (different formula somewhere, swapped vertex order, baseVertex baked wrong, etc.) — the exact bug class issue #51 names.
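
The `meshTriZ` half of step 3 amounts to a point-in-triangle test followed by barycentric interpolation. A minimal sketch, assuming triangles are tested in their XY projection and using hypothetical names (`MeshTriZSketch`, `TryInterpolateZ`) that are not from the codebase:

```csharp
using System;
using System.Numerics;

// Hypothetical helper for the meshTriZ half of the comparison.
public static class MeshTriZSketch
{
    // If (x, y) lies inside triangle abc (projected onto XY), interpolates Z
    // barycentrically and returns true; otherwise returns false.
    public static bool TryInterpolateZ(Vector3 a, Vector3 b, Vector3 c,
                                       float x, float y, out float z)
    {
        z = 0f;
        float det = (b.Y - c.Y) * (a.X - c.X) + (c.X - b.X) * (a.Y - c.Y);
        if (MathF.Abs(det) < 1e-12f) return false; // degenerate triangle

        float wa = ((b.Y - c.Y) * (x - c.X) + (c.X - b.X) * (y - c.Y)) / det;
        float wb = ((c.Y - a.Y) * (x - c.X) + (a.X - c.X) * (y - c.Y)) / det;
        float wc = 1f - wa - wb;

        const float eps = 1e-5f; // tolerate points that sit on a shared edge
        if (wa < -eps || wb < -eps || wc < -eps) return false;

        z = wa * a.Z + wb * b.Z + wc * c.Z;
        return true;
    }
}
```

A conformance test would loop over the mesh's triangles for each sample point, take the first triangle for which this returns true, and compare the resulting Z against `physicsZ`. The small edge tolerance keeps points on a cell diagonal from being rejected by both neighboring triangles.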

### Existing tests stay green

| Test file | Proves | N.5b impact |
|---|---|---|
| `TerrainBlendingTests.cs` | `CalculateSplitDirection` returns retail's formula | unchanged — still passes |
| `LandblockMeshTests.cs` | `LandblockMesh.Build` produces correct triangles | unchanged — still passes |
| `ClientConformanceTests.cs` | Existing conformance sweep | unchanged — still passes |
| `SplitFormulaDivergenceTest.cs` | WB↔retail divergence is real (49.98%) | unchanged — runs as data documentation; passes |
| All 71 tests in N.5 filter (Wb+MatrixComposition+TextureCacheBindless) | N.5 ship intact | unchanged — terrain is a separate dispatcher |

### `[TERRAIN-DIAG]` instrumentation

A new dedicated `[TERRAIN-DIAG]` log line, parallel to the existing `[WB-DIAG]` line, so terrain perf is observable independent of entity perf. Two parallel dispatchers, two parallel diag lines:

```
[TERRAIN-DIAG] cpu_ms=median/95th draws=N/frame visible=N loaded=N capacity=N
```

- `cpu_ms` — `Stopwatch` around `TerrainModernRenderer.Draw`. Median + 95th percentile over the 5-second rollup window.
- `draws` — DEIC drawcount param (number of visible landblocks dispatched per `glMultiDrawElementsIndirect` call). The GL call count should stay fixed at ~6-8 per frame regardless of the `draws` value.
- `visible` / `loaded` / `capacity` — slot accounting; for spotting growth or leaks.
- `gpu_ms` — `GL_TIME_ELAPSED` query around the indirect dispatch. Same double-buffering caveat as N.5 (deferred to N.6 perf polish; will report `0/0` until then).
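
The `cpu_ms` rollup described above (median plus 95th percentile over a window) can be sketched as follows. This is illustrative only: the class name is hypothetical, the nearest-rank percentile definition is an assumption (the spec does not say which definition `[WB-DIAG]` uses), and the caller is assumed to flush every 5 seconds.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rollup: collects per-frame timings, reports median and 95th percentile.
public sealed class FrameTimeRollupSketch
{
    private readonly List<double> _samplesMs = new();

    public void Add(double ms) => _samplesMs.Add(ms);

    // Nearest-rank percentile (percent in 1..100) over the current window; 0 if empty.
    public double Percentile(int percent)
    {
        if (_samplesMs.Count == 0) return 0.0;
        double[] sorted = _samplesMs.ToArray();
        Array.Sort(sorted);
        int rank = (percent * sorted.Length + 99) / 100; // integer ceil(percent/100 * n)
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }

    // Formats "median/95th" and clears the window for the next rollup.
    public string Flush()
    {
        string text = FormattableString.Invariant($"{Percentile(50):F2}/{Percentile(95):F2}");
        _samplesMs.Clear();
        return text;
    }
}
```

In use, each frame would call `Add(stopwatch.Elapsed.TotalMilliseconds)` after `Draw`, and the diag callback would call `Flush()` once per rollup window. An empty window reports `0.00/0.00`, which lines up with the `0/0` placeholder behavior noted for `gpu_ms`.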

### Visual verification gate (user runs the client)

**Scenes** (drive the character through each):
1. **Holtburg town** (~0xA9B0 area) — flat terrain + roads
2. **Holtburg sloped landblock** (~0xA9B1) — slopes + cell-boundary diagonal transitions
3. **Foundry-area** (~0x80xx) — different blend palette
4. **Any visibly-sloped outdoor landblock** — Direlands or wherever you regularly test slope behavior

**Checks** at each scene:
1. **No cell-boundary wobble** — the load-bearing #51 sentinel
2. **No missing chunks / black holes** — slot allocator or DEIC misalignment
3. **No texture seams at landblock edges** — pre-N.5b regression check
4. **No z-fighting** — pre-N.5b regression check
5. **Terrain stays at ~6-8 GL calls/frame regardless of the `[TERRAIN-DIAG] draws=N` value**
6. **`[TERRAIN-DIAG] cpu_ms` at radius=5 is ≥10% lower** than the pre-N.5b baseline (recorded in `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`)

Acceptance: all six checks pass in all four scenes. **Outdoor-only — interiors / dungeons / EnvCells are out of scope and not testable yet**.

---

## 8. Acceptance criteria

1. Build green; existing tests stay green; new conformance test passes (`|deltaZ| < 1mm` across the sweep).
2. Visual identity to today confirmed at the four user-verification scenes.
3. `[TERRAIN-DIAG]` shows terrain at ~6-8 GL calls/frame regardless of scene size (vs today's 25-121).
4. No cell-boundary wobble at any visited landblock (the #51 sentinel).
5. **CPU dispatcher time at radius=5 ≥10% lower** than today's `TerrainChunkRenderer` per-LB-binds path. Measured via the `[TERRAIN-DIAG] cpu_ms` median over a 5-second rollup at the Holtburg test scene with radius=5; before/after numbers captured into `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` (mirror N.5's perf baseline doc convention).
6. Issue #51 closed in `docs/ISSUES.md` with the SHIP commit SHA.

---

## 9. Out-of-scope (explicit boundaries)

N.5b does **not** ship any of these. Each is a separate phase or backlog item:

- **EnvCells / interior cells / dungeons** — different mesh source (cell-bound static geometry, not heightmap). Future phase, not currently scoped on the roadmap.
- **Sky rendering** (`SkyRenderer.cs`) — N.8 territory.
- **Particle rendering** (`ParticleRenderer.cs`) — N.8 territory.
- **Two-tier streaming + horizon LOD** (A.5) — separate brainstorm. Different streaming primitive (visible window split into "near tier" full-detail and "far tier" coarse-LOD). N.5b deliberately doesn't touch streaming radius or LOD machinery.
- **WB's `LandSurfaceManager` adoption** — Decision 2 explicitly keeps `TerrainAtlas`. Revisit only if a specific feature requires per-landblock alpha-mask bake.
- **WB's `TerrainGeometryGenerator` adoption** — Path C explicitly keeps acdream's `LandblockMesh.Build` as the source of truth. Don't call into WB's generator.
- **Fork-patching WB upstream** — Path C avoids this entirely. The WB submodule stays clean.
- **Persistent-mapped buffers / GPU-side culling / GL_TIME_ELAPSED double-buffering** — N.6 perf polish territory; not in N.5b scope.
- **Per-instance terrain "highlight" or per-LB tint** — no analogue need today; defer to backlog if a use case appears.
- **Removing `Texture2D` / `sampler2D` legacy texture path** — N.6 cleanup once Sky/Terrain/Debug/particle paths all migrate. N.5b only adds the `Texture2DArray` bindless path; legacy stays for non-terrain consumers.
- **Visual changes** — terrain renders pixel-for-pixel identical to today (same vertex layout, same blend math, same lighting bake). The phase is purely a dispatch-mechanism upgrade. Any visible diff means a bug, not a feature.

---

## 10. Implementation guidance

The phase is sized at ~1 week. Tasks decompose into ~10 mostly-parallel chunks:

1. **`TerrainAtlas` bindless extension** — add `GetBindlessHandles()` method. ~50 lines. Independent of dispatcher.
2. **`TerrainSlotAllocator`** — pure-CPU helper class. ~150 lines. Independent of GL.
3. **`TerrainSlotAllocatorTests`** — unit tests for #2. ~200 lines. Depends on #2.
4. **`terrain_modern.vert`** — port of today's `terrain.vert` with bindless preamble. ~150 lines. Independent.
5. **`terrain_modern.frag`** — port of today's `terrain.frag` with bindless preamble. ~150 lines. Independent.
6. **`TerrainModernRenderer`** — dispatcher class wiring slot allocator + GL state + bindless handle uniforms + DEIC dispatch. ~400 lines. Depends on #1, #2.
7. **`TerrainModernConformanceTests`** — Z-conformance sentinel. ~150 lines. Depends on `LandblockMesh.Build` (existing).
8. **`GameWindow` integration** — swap `TerrainChunkRenderer` → `TerrainModernRenderer` at field+construction; add `[TERRAIN-DIAG]` rollup. ~30 lines. Depends on #6.
9. **Delete legacy** — `TerrainChunkRenderer.cs`, `TerrainRenderer.cs`, `terrain.vert`, `terrain.frag`. Depends on #8 working in production.
10. **Roadmap + ISSUES.md + memory** — close issue #51, update CLAUDE.md "WB integration cribs", write `memory/project_phase_n5b_state.md`. Depends on #8 + visual verification.

Tasks 1, 2, 4, 5, 7 can land in parallel. Task 6 depends on 1+2. Task 8 depends on 6. Tasks 9 and 10 are post-verification cleanup.

The plan document (next step after this spec) breaks each task into TDD-style subtasks with clear acceptance gates per subagent dispatch.