Merge branch 'claude/happy-joliot-f67060' — Phase N.5b SHIP + A.5 handoff

Phase N.5b: terrain on the modern rendering path. TerrainModernRenderer
replaces TerrainChunkRenderer; bindless atlas via uvec2 handle + GLSL
sampler-from-handle constructor; constant-cost dispatch (~6µs/frame)
regardless of radius. Closes issue #51. 114 tests pass; conformance
sentinel max |delta| = 0.015 mm. Honest perf baseline doc captures
that modern is ~4x slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw call; architectural
wins manifest at higher radius (A.5).

Three high-value gotchas captured to memory:
1. uniform sampler2DArray + glProgramUniformHandleARB unreliable
   across drivers; default to uvec2 handle + sampler constructor.
2. Median calc copy[N - nz/2] underflows for nz<2.
3. Visual gate 'go' != 'verified'.

Plus: A.5 cold-start handoff at docs/research/2026-05-10-phase-a5-handoff.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Erik 2026-05-09 21:13:55 +02:00
commit 1e6a8123c6
19 changed files with 4034 additions and 864 deletions

View file

@ -102,6 +102,14 @@ ourselves".
eventually picks it up finds the hook there; the change is localized:
extend `InstanceData` stride 64→80 bytes, add the field, mix into
fragment color in `mesh_modern.frag`. ~30 min when the time comes.
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs` — terrain dispatcher
on N.5's modern primitives. Mirrors WB's `TerrainRenderManager` pattern
(single global VBO/EBO + slot allocator + `glMultiDrawElementsIndirect`)
but driven by acdream's `LandblockMesh.Build` so retail's `FSplitNESW`
formula is preserved (issue #51 resolved). Atlas handles bound via the
uvec2 + `sampler2DArray(handle)` constructor pattern (NOT the direct
`uniform sampler2DArray` + `glProgramUniformHandleARB` form, which
GL_INVALID_OPERATIONs on at least one driver).
**Execution phases:** R1→R8 in the architecture doc. Each phase has clear
goals, test criteria, and builds on the previous. Don't skip phases.
@ -504,13 +512,33 @@ acdream's plan lives in two files committed to the repo:
**Currently in flight: Phase N.6 — Perf polish.**
Roadmap entry at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
Builds on N.5. Legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`,
`WbFoundationFlag`) were retired in the N.5 ship amendment — N.6 scope is
perf-only: WB atlas adoption, persistent-mapped buffers, GPU-side culling,
GL_TIME_ELAPSED query double-buffering, direct N.4 vs N.5 perf measurement,
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky/Terrain/Debug).
Builds on N.5 + N.5b. Legacy renderers (`InstancedMeshRenderer`,
`StaticMeshRenderer`, `WbFoundationFlag`) were retired in the N.5 ship
amendment, and the terrain legacy renderer (`TerrainChunkRenderer` +
`TerrainRenderer` + legacy `terrain.vert/.frag`) was retired in N.5b.
N.6 scope is perf-only: WB atlas adoption, persistent-mapped buffers
(strong candidate after N.5b's per-frame DEIC `BufferSubData`),
GPU-side culling via compute pre-pass, GL_TIME_ELAPSED query
double-buffering, direct higher-radius perf comparison once A.5 lands,
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky / Debug
remain on the legacy path now that Terrain has migrated).
Plan + spec written when work begins.
**Phase N.5b (Terrain on Modern Rendering Path) shipped 2026-05-09.**
`TerrainModernRenderer` mirrors WB's `TerrainRenderManager` pattern
(single global VBO/EBO + slot allocator + bindless atlas +
`glMultiDrawElementsIndirect`) but consumes `LandblockMesh.Build` so
retail's `FSplitNESW` formula is preserved (Path C; closes ISSUE #51).
Path A (substitute WB's `CalculateSplitDirection`) killed by 49.98%
divergence vs retail in
[`tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`](tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs).
At radius=5 in Holtburg modern is ~4× SLOWER on CPU than the legacy
chunked path was; architectural wins manifest at higher radius. Honest
perf baseline at
[`docs/plans/2026-05-09-phase-n5b-perf-baseline.md`](docs/plans/2026-05-09-phase-n5b-perf-baseline.md).
Plan archived at
[`docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`](docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md).
**Phase N.5 (Modern Rendering Path) shipped + amended 2026-05-08.** `WbDrawDispatcher`
on bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23ms/frame
at Holtburg (~810 fps). **Ship amendment:** `InstancedMeshRenderer`,

View file

@ -46,64 +46,6 @@ Copy this block when adding a new issue:
# Active issues
## #51 — WB's terrain-split formula diverges from retail's `FSplitNESW`
**Status:** OPEN
**Severity:** MEDIUM (blocks isolated N.2; affects sequencing of N-phase migration)
**Filed:** 2026-05-08
**Component:** terrain math / Phase N (WorldBuilder rendering migration)
**Description:** WB's `TerrainUtils.CalculateSplitDirection`
([references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44](references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44))
uses a different math expression from retail's `FSplitNESW`
(documented in CLAUDE.md as **the** real AC terrain split formula,
constants `0x0CCAC033` / `0x421BE3BD` / `0x6C1AC587` / `0x519B8F25`).
Ours is a degree-2 polynomial in (x,y); WB's is linear in (x,y).
They cannot be algebraically equivalent and disagree on a meaningful
fraction of cells.
**Concrete impact:** On any cell where the formulas pick different
diagonals, the same world position (X, Y) maps to different terrain
heights — up to ~2m for a sloped cell with one elevated corner. If a
caller mixes "WB-formula path" and "AC2D-formula path" for the same
cell, the player physics floats above or sinks below the visible
ground. This is the bug class fixed in
[src/AcDream.Core/Physics/TerrainSurface.cs:113-120](src/AcDream.Core/Physics/TerrainSurface.cs:113)
(diagonal-direction inversion).
**Files implicated:**
- `src/AcDream.Core/Physics/TerrainSurface.cs` — uses AC2D formula via
`IsSplitSWtoNE`
- `src/AcDream.Core/World/TerrainBlending.cs` — visual mesh, also AC2D
- `references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44`
— WB's diverging formula
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs`
— WB's render mesh (presumably also uses WB's formula in lockstep)
**Sequencing implication:** Phase N.2 (terrain math helpers
substitution) cannot be shipped in isolation — it must land alongside
visual terrain renderer migration (originally N.5, now moved to N.7
scope), at which point both physics and visual mesh switch to WB's
formula together. N.5 shipped entity rendering only; terrain remains
on acdream's own pipeline through N.7.
**Research needed (when N.7 picks this up):**
1. Quantify divergence: run WB's `CalculateSplitDirection` and our
`IsSplitSWtoNE` across all (lbX, lbY, cellX, cellY) tuples for a
representative landblock set; record disagreement rate.
2. Confirm WB's `TerrainGeometryGenerator` uses WB's formula in its
render mesh — if so, switching everything to WB's formula keeps
visual + physics synced. (Highly likely.)
3. Decide whether ANY retail-conformance test (e.g., physics matching
server-authoritative Z within tolerance) is invalidated by the
formula change.
**Acceptance:** Resolved when N.7 lands and both physics + visual
terrain use WB's split formula, OR when we decide to keep the AC2D
formula and patch WB's renderer in our fork.
---
## #50 — Road-edge tree at 0xA9B1 visible in acdream but not retail
**Status:** OPEN
@ -1758,6 +1700,57 @@ Unverified. The likely culprits, ranked by suspected probability:
# Recently closed
## #51 — [DONE 2026-05-09 · da56063 + N.5b SHIP] WB's terrain-split formula diverges from retail's `FSplitNESW`
**Closed:** 2026-05-09
**Commit:** `da56063` (black-terrain fix; landed within Phase N.5b — see
`docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md` for the
ship commit chain)
**Component:** terrain math / Phase N.5b
**Resolution: Path C.** Phase N.5b lifted terrain rendering onto the
modern path (bindless atlas + `glMultiDrawElementsIndirect`) WITHOUT
adopting WB's `TerrainUtils.CalculateSplitDirection`. The pre-implementation
divergence test (`tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`)
confirmed the two formulas disagree on **49.98%** of sweep cells —
fundamentally incompatible with our shared physics + visual mesh, which
both rely on retail's `FSplitNESW` (constants `0x0CCAC033` / `0x421BE3BD` /
`0x6C1AC587` / `0x519B8F25`).
Path C: keep retail's `FSplitNESW` formula via `LandblockMesh.Build`
`TerrainBlending.CalculateSplitDirection`; mirror WB's `TerrainRenderManager`
architectural pattern (single global VBO/EBO + slot allocator + bindless
atlas + multi-draw indirect) but feed it acdream's mesh. Modern dispatcher
(`TerrainModernRenderer`) replaces `TerrainChunkRenderer` (deleted in T9
along with `TerrainRenderer` + `terrain.vert/.frag`).
Path A (substitute WB's formula) was killed by the divergence test.
Path B (fork-patch WB's renderer to use retail's formula) was rejected
for permanent maintenance burden. Path C ships the architectural
pattern while preserving retail-formula compliance.
Visual mesh and physics both still consume retail's `FSplitNESW`; they
remain in lockstep, no triangle-Z hover. The N.6 / N.7 sequencing
implication this issue carried (substitute physics math only when the
visual mesh migrates) is moot — neither side ever switches to WB's
formula.
**Files added:**
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs`
- `src/AcDream.Core/Terrain/TerrainSlotAllocator.cs`
- `src/AcDream.App/Rendering/Shaders/terrain_modern.vert`
- `src/AcDream.App/Rendering/Shaders/terrain_modern.frag`
- `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs` (the
test that killed Path A)
**Files deleted (T9):**
- `src/AcDream.App/Rendering/TerrainChunkRenderer.cs`
- `src/AcDream.App/Rendering/TerrainRenderer.cs`
- `src/AcDream.App/Rendering/Shaders/terrain.vert`
- `src/AcDream.App/Rendering/Shaders/terrain.frag`
---
## #43 — [DONE 2026-05-05 · 9e4772a] Slope staircase on observed player remotes (anim-only fallback ignored slope)
**Closed:** 2026-05-05

View file

@ -1,6 +1,6 @@
# acdream — strategic roadmap
**Status:** Living document. Updated 2026-05-08 for Phase N.5 shipping (bindless textures + `glMultiDrawElementsIndirect` on top of N.4's foundation; CPU dispatcher 1.23ms/frame at Holtburg, ~810 fps) + N.6 becomes the new in-flight phase (retire legacy renderers + perf polish).
**Status:** Living document. Updated 2026-05-09 for Phase N.5b shipping (terrain on the modern rendering path via Path C — mirror WB's `TerrainRenderManager` pattern, consume `LandblockMesh.Build` for retail formula compliance; closes ISSUE #51). N.6 (perf polish) remains the in-flight phase.
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
---
@ -61,6 +61,7 @@
| N.3 | WorldBuilder-backed texture decode — `SurfaceDecoder` delegates INDEX16 / P8 / A8R8G8B8 / R8G8B8 / A8(+Additive) to `TextureHelpers.Fill*`; `isAdditive` threaded through (terrain alpha → `FillA8Additive`, non-additive entity surfaces → `FillA8`). R5G6B5 + A4R4G4B4 newly handled (previously magenta). X8R8G8B8, DXT1/3/5, SolidColor remain ours (no WB equivalent). 9 conformance tests prove byte-identical equivalence per format. | Live ✓ |
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6 (retired early in N.5 ship amendment). | Live ✓ |
| N.5 | Modern rendering path — lifted `WbDrawDispatcher` onto bindless textures (`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch data @ binding=1, indirect commands) + 2 indirect draw calls (opaque + transparent). ~12-15 GL calls per frame regardless of group count, down from hundreds-of-per-group in N.4. CPU dispatcher: 1.23 ms/frame median at Holtburg courtyard (1662 groups, ~810 fps sustained). All textures on the WB modern path use 1-layer `Texture2DArray` + `sampler2DArray`. Legacy callers keep `Texture2D` / `sampler2D` via the parallel `TextureCache` path until N.6 retires them. Three gotchas captured in memory: texture target lock-in, bindless Dispose order (two-phase non-resident before delete), GL_TIME_ELAPSED double-buffering. **Ship amendment 2026-05-08:** legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`, `WbFoundationFlag`) retired within N.5 — modern path is mandatory; missing bindless throws `NotSupportedException` at startup. N.6 scope narrowed accordingly. Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`. | Live ✓ |
| N.5b | Terrain on the modern rendering path — `TerrainModernRenderer` replaces `TerrainChunkRenderer` (the latter plus `TerrainRenderer` + `terrain.vert/.frag` deleted). Single global VBO/EBO with slot allocator (one slot per landblock), per-frame `DrawElementsIndirectCommand[]` upload + `glMultiDrawElementsIndirect`, bindless atlas handles passed as `uvec2` uniforms reconstructed via `sampler2DArray(handle)`. **Path C** chosen: mirrors WB's `TerrainRenderManager` pattern but consumes `LandblockMesh.Build` so retail's `FSplitNESW` formula is preserved (closes ISSUE #51). Path A killed by 49.98% measured divergence between WB's `CalculateSplitDirection` and retail's at addr `00531d10`; Path B (fork-patch WB) rejected for permanent maintenance burden. Perf at Holtburg radius=5 (commit `da56063`): modern 6.4-7.0 µs / 9-14 µs p95 vs legacy 1.5 µs / 3.0 µs — **modern is ~4× SLOWER on CPU at radius=5** because legacy's 16×16-LB chunking collapsed visible LBs to one `glDrawElements`. Architectural wins (zero `glBindTexture`/frame, constant-cost dispatch, per-LB frustum cull) manifest at higher radius (A.5 territory). Spec acceptance criterion 5 ("≥10% lower CPU at radius=5") amended via `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`. Three gotchas captured in memory: `uniform sampler2DArray` + `glProgramUniformHandleARB` GL_INVALID_OPERATIONs on at least one driver (use `uniform uvec2` + `sampler2DArray(handle)` constructor instead — N.5's mesh_modern pattern); `MaybeFlushTerrainDiag` median-calc underflow on first sample; visual gates need actual visual confirmation, not assent. Plan archived at `docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`. | Live ✓ |
Plus polish that doesn't get its own phase number:
- FlyCamera default speed lowered + Shift-to-boost
@ -641,23 +642,43 @@ for our deletions/additions; merge upstream `master` periodically.
lock-in, bindless Dispose two-phase order, GL_TIME_ELAPSED double-
buffering. Plan archived at
`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
- **N.5b — Terrain rendering on N.5 path.** Wire WB's
`TerrainRenderManager` + `LandSurfaceManager` + `TerrainGeometryGenerator`
onto the modern rendering path. Closes N.2's deferred terrain math
substitution: visual mesh and physics both switch to WB's
`CalculateSplitDirection` + `GetHeight` + `GetNormal` in lockstep,
resolving ISSUE #51. **Estimate: 1-2 weeks** (was 2-3 — modern path
primitives already in place from N.5).
- **✓ SHIPPED — N.5b — Terrain on the modern rendering path.** Shipped
2026-05-09. **Path C** (mirror WB's `TerrainRenderManager` pattern but
consume `LandblockMesh.Build` for retail-formula compliance). Path A
(substitute WB's `CalculateSplitDirection`) killed during pre-implementation
divergence test: WB's formula disagrees with retail's `FSplitNESW`
(addr `00531d10`) on **49.98%** of cells across `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`'s
sweep — wholly incompatible with our shared physics + visual mesh.
Path B (fork-patch WB to use retail's formula) rejected for permanent
maintenance burden. Path C ships the architectural pattern (single
global VBO/EBO + slot allocator + bindless atlas + `glMultiDrawElementsIndirect`)
while keeping retail's formula via `LandblockMesh.Build`
`TerrainBlending.CalculateSplitDirection`. `TerrainModernRenderer` +
`terrain_modern.vert/.frag` shipped, `TerrainChunkRenderer` +
`TerrainRenderer` + legacy `terrain.vert/.frag` deleted in T9.
Closes ISSUE #51. **Perf reality check:** at radius=5 in Holtburg,
modern is ~4× SLOWER on CPU than legacy was (6.4 µs vs 1.5 µs median;
legacy collapsed radius=5's visible LBs into one `glDrawElements`
via 16×16-LB chunking). Architectural wins (zero `glBindTexture`/frame,
constant-cost dispatch as A.5 raises radius, per-LB frustum cull)
manifest at higher radius. Spec acceptance criterion #5 was wrong;
amended via `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`. Plan
archived at `docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`.
- **N.6 — Perf polish.** **Currently in flight.**
Builds on N.5. Legacy renderer retirement was pulled forward into N.5
ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`, and
`WbFoundationFlag` are already gone. N.6 scope: WB atlas adoption for
memory savings on shared content, persistent-mapped buffers if
`glBufferData` shows up in profiling, GPU-side culling via compute
pre-pass, GL_TIME_ELAPSED query double-buffering (deferred from N.5 —
diagnostic shows `gpu_us=0/0` under `ACDREAM_WB_DIAG=1`), direct N.4
vs N.5 perf measurement, retire the legacy `Texture2D`/`sampler2D` path
in `TextureCache` (currently kept for Sky + Terrain + Debug).
Builds on N.5 + N.5b. Legacy renderer retirement was pulled forward
into N.5 ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`,
`WbFoundationFlag` are gone — and the terrain legacy renderer
(`TerrainChunkRenderer` + `TerrainRenderer` + `terrain.vert/.frag`)
retired in N.5b. N.6 scope: WB atlas adoption for memory savings
on shared content, persistent-mapped buffers if `glBufferData` shows
up in profiling (the modern terrain path's per-frame DEIC `BufferSubData`
is a candidate), GPU-side culling via compute pre-pass (eliminates
the per-frame slot walk + DEIC build entirely), GL_TIME_ELAPSED query
double-buffering (deferred from N.5 — diagnostic shows `gpu_us=0/0`
under `ACDREAM_WB_DIAG=1`), direct higher-radius perf comparison once
A.5 lands (where modern's architectural wins manifest), retire the
legacy `Texture2D`/`sampler2D` path in `TextureCache` (currently kept
for Sky + Debug + particle paths now that Terrain has migrated).
Plan + spec written when work begins. **Estimate: 1-2 weeks.**
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's

View file

@ -0,0 +1,98 @@
# Phase N.5b — terrain perf baseline
**Captured:** 2026-05-09 at Holtburg town dueling field, radius=5, ~30s standstill.
## Methodology
Same build (commit at perf measurement: `da56063`), `ACDREAM_WB_DIAG=1`. The build
included a TEMPORARY `ACDREAM_LEGACY_TERRAIN=1` env-var toggle (since retired in T9
deletion of the legacy renderer) that routed Draw through the legacy renderer for
direct comparison. Both renderers were constructed and fed AddLandblock / RemoveLandblock
in parallel; only one drew per frame; the same Stopwatch wrapped whichever ran.
## Numbers
| Renderer | cpu_us median | cpu_us p95 | draws/frame | Visible LBs |
|---|---|---|---|---|
| **Legacy** (`TerrainChunkRenderer`) | 1.5 | 3.0 | 1 (1 chunk) | 132-143 (whole chunk) |
| **Modern** (`TerrainModernRenderer`) | 6.4-7.0 | 9-14 | ~36-51 | 36-51 (per-LB cull) |
(Legacy `draws=1` because its 16×16-LB chunking collapses radius=5's 121 visible
landblocks into a single chunk, dispatched as one `glDrawElements`. Modern issues
one `glMultiDrawElementsIndirect` with N=36-51 sub-commands.)
## Acceptance criterion
The N.5b spec acceptance criterion 5 read: "CPU dispatcher time at radius=5 ≥10%
lower than today's per-LB-binds path." The captured numbers show modern is ~4×
HIGHER on CPU at radius=5. **The criterion was wrong** — at radius=5 in Holtburg,
legacy's chunked path was already collapsed to one draw call. The architectural
wins of multi-draw indirect manifest at higher chunk counts (A.5 territory).
The spec is amended via this doc: ship N.5b on visual identity + structural
correctness rather than CPU savings at radius=5.
## Architectural wins of the modern path (real, even when CPU is higher)
1. **Zero `glBindTexture` per frame.** Bindless atlas handles are made resident
once at startup; the modern shader samples via `sampler2DArray(uvec2 handle)`.
Legacy issued 2 `glBindTexture(Texture2DArray)` calls per frame.
2. **Constant-cost dispatch.** As A.5 raises the streaming radius (next phase),
the visible chunk count grows. Legacy scales linearly: at radius=10 (4× chunks)
it's 4 `glDrawElements` calls; at radius=15 (≥9 chunks) it's 9+ calls. Modern
stays at exactly 1 `glMultiDrawElementsIndirect` regardless.
3. **Per-LB frustum culling.** Legacy culled at chunk granularity (16×16 LBs);
modern culls per-LB. At a typical Holtburg view, ~36-51 of 132 loaded LBs are
actually visible; legacy drew the entire 132-LB chunk (3.5× the visible work
pushed to GPU vertex/fragment stages, even though CPU dispatch was cheap).
## Why modern's CPU was higher at radius=5
Per-frame work in modern (in microseconds-ish budget on this scene):
- Walk all loaded slots checking visibility (~120 slots) → AABB test each
- Build DEIC array (51 entries × 20 bytes = 1020 bytes)
- `glBufferSubData(DRAW_INDIRECT_BUFFER, ...)` — driver memcpy
- 2× `glProgramUniform2(..., handle.low, handle.high)` for atlas handles
- `glBindVertexArray` + `glMemoryBarrier(GL_COMMAND_BARRIER_BIT)` + `glMultiDrawElementsIndirect`
Legacy's per-frame work:
- Bind 2 textures
- Bind one VAO (the chunk)
- One `glDrawElements`
The DEIC array build + buffer upload alone is ~3-5µs at radius=5 on this hardware,
which is the bulk of the modern overhead. At higher radius, this overhead amortizes:
the buffer is similar size, but the alternative (legacy's N draws) grows.
## Follow-up work
- **A.5 (next phase)** will exercise the higher-radius case where modern wins.
Capture a fresh baseline at radius=8 / 10 once A.5 lands.
- **N.6 perf polish** can investigate persistent-mapped buffers for the indirect
buffer, which would eliminate the per-frame `glBufferSubData`. Likely small win
at radius=5 (single ~1KB upload), bigger at higher radii.
- **GPU-side culling** (compute shader generating the DEIC array directly into
the indirect buffer) eliminates the CPU slot walk + DEIC build entirely. N.6 or
later territory; only worth it if profiling shows the CPU walk is hot.
## Lessons captured to memory
`memory/project_phase_n5b_state.md` records the high-value gotchas surfaced
during N.5b implementation. Three particularly bitable ones:
1. **`uniform sampler2DArray` + `glProgramUniformHandleARB` is unreliable.** Some
drivers (NVIDIA Windows in this case) reject the combination with
`GL_INVALID_OPERATION`. Use the `uniform uvec2` + `sampler2DArray(handle)`
constructor pattern instead — N.5's mesh_modern uses this, and N.5b's
terrain_modern adopted it after the black-terrain regression.
2. **`MaybeFlushTerrainDiag` underflow.** A naive median calc (`copy[N - nz/2]`)
underflows to `copy[N]` when only one sample has been recorded. Use
`copy[N - 1 - (nz - 1) / 2]` instead.
3. **Visual gate must actually be visually confirmed.** "Go" doesn't mean
"verified." During N.5b's gate the user said "go" without launching, which
masked the black-terrain regression for hours. The gate must include the
user reporting actual visual confirmation, not assent to proceed.

View file

@ -0,0 +1,376 @@
# Phase A.5 — Two-tier Streaming + Horizon LOD — Cold-Start Handoff
**Created:** 2026-05-10, immediately after N.5b ship.
**Audience:** the next agent picking up streaming + horizon-LOD work.
**Purpose:** brief you on where N.5b left things, what A.5 actually has to do
to make the world look and feel great, and the load-bearing facts the
brainstorm should be informed by.
---
## TL;DR
N.5b just shipped: outdoor terrain rendering is on bindless + multi-draw
indirect via `TerrainModernRenderer`. Constant-cost dispatch as the
visible landblock count grows — radius=5 vs radius=15 are the same number
of GL calls for terrain.
**A.5's actual goal — verbatim from the user, 2026-05-09:**
> "I just want great smooth HIGH fps visuals. Should look great. As long
> as it scales and we get very high FPS"
That reframes priorities. We are NOT optimizing the inner loop at radius=5
(it's solved). We're scaling visual reach + scene density without the
client falling off a perf cliff.
**Concretely, A.5 ships three things:**
1. **Two-tier streaming.** Near tier (≤ N₁ landblocks) loads everything as
today (terrain + scenery + EnvCells + collision). Far tier (N₁ < r N)
loads terrain mesh ONLY. No scenery generation, no collision, no
entity registration for the far tier.
2. **Per-LB entity bucketing for the WB dispatcher.** Today the entity
dispatcher walks every loaded entity each frame for AABB cull —
~16K entities @ ~1µs/test = 4.3ms/frame, dominating the frame budget.
Bucket entities by landblock so the cull is hierarchical: cull the LB
first, then only walk entities inside surviving LBs.
3. **Off-thread mesh build.** `LandblockMesh.Build` currently runs on the
render thread when a new LB streams in. At today's radius=5 this is
invisible; at A.5's higher N₂ it becomes a visible frame-time spike
when 4-5 LBs stream simultaneously. Move the build to a worker pool;
hand finished `LandblockMeshData` back via a queue.
The headline win you're shooting for: **radius=15 sustains the user's
target FPS in Holtburg with no streaming hitches.**
---
## Where N.5b left things
### Branch state (relative to main)
After N.5b ships:
- N.5b SHIP at `08b7362` (final commit; appended SHIP record to plan)
- Roadmap entry, issue #51 closure, perf baseline doc all in place at `083c10c`
- Legacy `TerrainChunkRenderer` + `TerrainRenderer` + `terrain.vert/.frag`
deleted at `7dfa2af`. **The modern path is the only path.**
### Captured perf baseline (load-bearing for A.5's "what's actually hot")
From `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`, measured
2026-05-09 at Holtburg town dueling field, radius=5, ~30s standstill:
| Subsystem | cpu_us median per frame | Notes |
|---|---|---|
| **Entity dispatcher** (`WbDrawDispatcher`) | **~4,300** | 86% of frame budget. ~16K entities walked for AABB cull. THIS is the bottleneck. |
| Terrain dispatcher (`TerrainModernRenderer`) | ~6.4 | <1% of frame. Constant-cost regardless of radius (proved in N.5b). |
| Everything else (sky, particles, ImGui, swap, audio) | ~700 | Small. |
**Actual FPS at radius=5 in Holtburg: ~200 fps** (frame time ≈ 5ms).
NOT the "810 fps" inferred from the N.5 ship doc (that was 1/dispatcher_ms,
which is only the WB dispatcher CPU cost in isolation, not real frame time).
### What naive radius increase does
If you simply raised `ACDREAM_STREAM_RADIUS` to 15 today without A.5:
- Loaded landblocks: 121 → ~961 (8× more). Acceptable.
- Loaded entities: ~16K → ~125K (linear scaling with LB count). **NOT
acceptable.** At ~1µs per AABB cull, the entity dispatcher would take
~125ms/frame = 8 FPS. Slideshow.
- Memory footprint: similar 8× explosion in scenery instance buffers.
So the perf cliff is real and immediate. A.5 has to address it BEFORE
the radius can be safely raised.
### What N.5b set up that A.5 inherits
- **Modern terrain dispatcher.** `TerrainModernRenderer` is O(1) GL calls
in radius. As you add far-tier LBs (terrain only), the terrain
dispatcher cost stays flat (~6µs/frame). This is the one subsystem
that doesn't need any A.5 work — it just scales.
- **Slot allocator for terrain GPU buffers.** Already grows by power-of-two
doubling. Will absorb radius=15 (~961 slots × ~15 KB each = ~14 MB)
without manual tuning.
- **`[TERRAIN-DIAG]` instrumentation.** Reports per-frame median + p95 in
microseconds. Use this to confirm A.5 doesn't regress terrain perf.
- **Conformance sentinel.** `TerrainModernConformanceTests` proves visual
mesh Z agrees with `TerrainSurface.SampleZFromHeightmap` to 0.015 mm.
Don't break this — physics ↔ visual agreement must hold across both
tiers.
- **Bindless atlas.** `TerrainAtlas.GetBindlessHandles()`. The far tier
shares the atlas (it's region-wide). Zero atlas-related per-LB cost.
---
## The brainstorm questions (the hard calls A.5 has to make)
These are the questions to resolve in the brainstorm step. Bring them to
the user with options + recommendation; don't prejudge.
### 1. Tier radii: what are N₁ and N₂?
- **N₁** = near-tier radius (everything loads). Today's default `STREAM_RADIUS`.
Probably stays at 5 (or maybe 4; maybe 3).
- **N₂** = far-tier radius (terrain mesh only). Could be 8, 12, 15, 20.
Tradeoffs: bigger N₂ = more world visible = looks better. But each far-tier
LB still costs ~16 KB GPU memory + a frustum cull AABB + a slot allocation.
At N₂=15, that's ~961 LBs × 16 KB = ~15 MB GPU mem (cheap) + ~961 cull
tests (cheap, ~1ms total at 1µs each — and we'll do this per-LB cull
anyway as part of #2 below).
Verify against retail: cdb attach + check how many landblocks retail keeps
loaded at a given vantage point. Probably around 10-12 per the AC2D
references and the holtburger client's behavior.
### 2. Far tier: terrain only? Or also impostor scenery?
Two options:
- **Terrain only** (cleanest). Beyond N₁, no trees, no rocks. Skyline is the
terrain mesh against the sky.
- **Impostor scenery** (more retail-like). Beyond N₁, generate flat
billboards or low-poly trees instead of full meshes. Adds substantial
complexity (billboard pipeline, mesh-LOD generation, per-camera-angle
rotation).
Recommendation: start with terrain-only. Add impostors only if the
horizon looks wrong (too bare). Retail definitely has SOME distant
scenery but the cutoff is gradual; we can match it later if needed.
### 3. Entity bucketing structure
Today: `WbDrawDispatcher` keeps a flat dictionary of all entities and
walks all of them per frame. To bucket by LB, we need:
- A `Dictionary<uint, List<EntityHandle>>` keyed by landblock ID
- On `AddEntity(...)`, also stash it in the LB bucket (the spawn flow
already knows the LB context)
- On `RemoveEntity(...)`, remove from the LB bucket too
- Per frame: cull at LB granularity first; then cull entities only inside
surviving LBs
LB-level AABBs are already computed (per the existing `_visibleSlots`
logic in `TerrainModernRenderer` — the same AABB applies to entities,
modulo a Z-range bump for trees/buildings).
Open question: do entities outside a known LB exist? (Items dropped on the
ground? Ephemeral effects? Player projectiles?) If yes, they need a
fallback "unknown LB" bucket that's still walked every frame. Probably
small.
### 4. Where does the off-thread mesh build land?
Today `LandblockMesh.Build` runs synchronously inside `OnLandblockLoaded`
on the render thread. To move it off:
- `StreamingLoader` worker thread (already async for dat reads) signals
"LB X is ready"
- A new worker pool consumes that signal, builds the mesh on a worker
thread, posts the finished `LandblockMeshData` to a `ConcurrentQueue`
- Render thread drains the queue at the start of each frame, calling
`_terrain.AddLandblock(...)` for each ready mesh
Gotcha: the `TerrainBlendingContext` is shared. Need to confirm it's
read-only (it is — built once at startup). Also `_surfaceCache`
currently a plain `Dictionary` populated lazily by `TerrainBlending.BuildSurface`.
Either lock it, replace with `ConcurrentDictionary`, or pre-populate with
all known palCodes at startup.
### 5. Streaming hysteresis at the tier boundary
When the player crosses N₁ → near-tier shrinks, far-tier grows.
LBs that were near-tier need to:
- Drop their scenery (unregister entities)
- Drop their EnvCells
- Keep the terrain mesh (still in far tier)
When the player crosses back: the LB needs scenery + EnvCells re-loaded.
Hysteresis (don't churn at the exact boundary) is needed.
The streaming loader already has hysteresis for full LB load/unload. A.5
extends that: a separate hysteresis radius for the scenery/entity layer.
### 6. Visual quality wins to ride along
A.5 is the natural place to land 2-3 nearly-free quality wins:
- **Mipmapped terrain atlas + anisotropic 16x.** Today the atlas is
`GL_LINEAR` no mipmaps; distant terrain shimmers. ~half-day fix.
Big visible improvement at far tier.
- **Tree alpha-test → alpha-to-coverage with MSAA.** Today tree edges are
binary cutoff and pixel-edged. A2C with MSAA fixes them. ~one day.
- **Correct depth-write for transparent foliage.** Some scenery passes
may be writing depth incorrectly; confirm + fix.
These are not strictly required for A.5 to ship, but they amplify the
"looks great" payoff.
### 7. Acceptance metrics
The user's goal is "smooth + high FPS + great-looking + scales." Pin
this concretely:
- Target FPS at radius (whatever final N₁ + N₂): ≥ user's monitor refresh
(probably 144 or 240 Hz). Capture before/after numbers in a perf
baseline doc parallel to N.5b's.
- No frame-time spikes > 5ms during streaming (record a 60-second
trace running through Holtburg → North Yanshi).
- Visual horizon visible at the new N₂. Capture screenshots from the
same vantage point at the start of A.5 (before) and at ship (after)
for the SHIP record.
### 8. What's NOT in A.5
A.5 does not need to ship:
- GPU-side culling (compute-shader cull). Bigger lift; N.6 territory.
- Persistent-mapped indirect buffer. N.6 territory.
- Sky / particles / EnvCells migration. Separate N.7+ phases.
- Shadow mapping. Separate visual phase.
Don't let scope creep pull these in.
---
## Files to read before brainstorming
In rough order of relevance:
1. **`docs/research/2026-05-09-phase-n5b-handoff.md`** — N.5b's handoff
(read for context on what was just shipped + the structure of these
handoff docs).
2. **`docs/plans/2026-05-09-phase-n5b-perf-baseline.md`** — captured
perf numbers + the architectural reasoning for what A.5 inherits.
3. **`memory/project_phase_n5b_state.md`** — three high-value gotchas
captured during N.5b (especially #1: bindless uniform-sampler driver
quirk; A.5 won't directly need this, but it's the prior art for any
new shader code in the phase).
4. **`docs/plans/2026-04-11-roadmap.md`** A.5 entry — the original A.5
description.
5. **The streaming loader**`src/AcDream.Core/World/StreamingLoader.cs`
(or wherever it lives; grep for `OnLandblockLoaded`). Understand the
existing ring + hysteresis logic before extending it.
6. **WB dispatcher entity flow**
`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` lines covering
`Draw` (the per-entity walk) and `EntitySpawnAdapter` (where entities
get registered). The bucketing change lands here.
7. **`LandblockMesh.Build`** — `src/AcDream.Core/Terrain/LandblockMesh.cs`.
Its inputs (heightmap, ctx, surfaceCache) determine what the worker
thread needs. ~150 lines.
8. **WB's `SceneryRenderManager`**
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/SceneryRenderManager.cs`.
Has a render-distance cap; informs N₁ vs N₂ defaults.
9. **`TerrainModernRenderer`** —
`src/AcDream.App/Rendering/TerrainModernRenderer.cs`. Don't modify;
confirm the slot allocator handles radius=15 cleanly.
---
## Acceptance criteria for the whole phase
1. Build green; existing tests stay green; N.5b's conformance sentinel
still passes (visual mesh Z = TerrainSurface Z within 1mm).
2. **Far-tier LBs render terrain visibly past N₁** in user-driven visual
verification.
3. **Per-frame entity-dispatcher cpu_us at radius=N₁ drops** vs today
(the bucketing should help even at the current radius).
4. **Per-frame entity-dispatcher cpu_us at radius (N₁+N₂) is bounded**
— does NOT scale linearly with total loaded LBs. Specifically:
bucketed cull should be < 1.5× today's cost despite far-tier LBs
loading.
5. **No streaming hitch > 5ms** when running at run-speed across N₁/N₂
tier boundaries simultaneously (capture a 60s trace).
6. **`[TERRAIN-DIAG]` cpu_us stays flat** as N₂ grows — the terrain
dispatcher proven O(1) (regression check).
7. Visual identity at near-tier (no scenery missing inside N₁; no
z-fighting; no cell-boundary wobble — N.5b sentinel still applies).
8. SHIP record + perf baseline + memory entry written, mirroring N.5b's
pattern.
---
## What you'll be doing in the first 30 minutes
1. Read this handoff in full.
2. Read `docs/research/2026-05-09-phase-n5b-handoff.md` for the structural
pattern.
3. Read `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` for the captured
numbers A.5 inherits.
4. Read `memory/project_phase_n5b_state.md` for gotchas.
5. Verify build is green: `dotnet build`.
6. Verify N.5b ship is intact: `dotnet test --filter "FullyQualifiedName~TerrainSlot|FullyQualifiedName~TerrainModernConformance|FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"` (target ≥114 passing, 0 failures).
7. Capture a baseline radius=5 frame trace yourself (one launch, 30s
standstill at Holtburg dueling field) so you have a "before" number
in your own measurement environment, not just trusting N.5b's number.
8. Invoke `superpowers:brainstorming` with the user. Walk through the
8 brainstorm questions above. Present each with options + my
recommendation; don't prejudge.
9. After agreement, write the spec; then the plan; then execute
task-by-task using `superpowers:subagent-driven-development`.
Don't skip the brainstorm. The N₁/N₂ values, the bucketing structure
trade-offs, and the worker-thread design are real decisions with
downstream consequences that need user input — not "the agent makes a
call and goes."
---
## Things to NOT do
- **Don't raise `ACDREAM_STREAM_RADIUS` without A.5's tiered loading
in place.** The entity-cull cliff is immediate and severe (8 FPS at
naive radius=15).
- **Don't put scenery in the far tier just to "look more retail" without
a billboard/impostor pipeline.** Full-detail scenery in the far tier
is what causes the cull cliff.
- **Don't move `LandblockMesh.Build` to a worker thread without first
auditing `TerrainBlendingContext` + `_surfaceCache` for thread
safety.** Concurrent writes to the surfaceCache will produce
silently-wrong terrain blending.
- **Don't break the N.5b conformance sentinel.** If A.5 changes how
meshes are built (e.g., for the worker thread), the conformance
test must still pass — it's the load-bearing physics ↔ visual Z
agreement guard.
- **Don't bundle GPU-side culling, persistent-mapped buffers, or shadow
mapping into A.5.** Those are N.6+ territory; A.5 is "make the world
look big and not stutter."
- **Don't ship without honest perf numbers.** If A.5 doesn't actually
hit its FPS target, document why and ship N.6 next instead of
papering over it. The N.5b precedent is honest reporting.
- **Don't skip the visual verification gate.** Same lesson from N.5b's
black-terrain regression: "go" doesn't mean "verified." User must
actually launch the client at radius=N₂ and confirm the horizon
looks great + FPS hits target.
---
## Reference: where the FPS budget actually goes today
For brainstorming purposes, the per-frame breakdown at radius=5 / Holtburg
(real measurement, 2026-05-09):
```
~5,000 µs total frame time (= 200 fps)
├── 4,300 µs WbDrawDispatcher entity cull + dispatch ← THE BOTTLENECK
│ ~16K entity AABB tests / frame
│ A.5's entity bucketing attacks this directly
├── 6 µs TerrainModernRenderer
│ O(1) in radius. Won't grow with A.5. Already solved.
├── ~700 µs Sky, particles, ImGui, audio, swap-buffers, misc
│ Mostly fixed cost; some VSync-related
└── rest GPU side (we don't measure this — query plumbing
deferred to N.6). Could be substantial.
```
The first action of A.5 is to recognize that the perf claim "810 fps"
from N.5 was misleading. Don't repeat the mistake — measure the actual
frame time, not just one subsystem.
---
Good luck. The phase is meaty (~2 weeks) but the structural work is
well-shaped: tiered streaming has clear boundaries, entity bucketing is
an isolated dispatcher change, off-thread mesh build is a well-understood
worker pattern. The hard call is the N₁/N₂ values, and that's a
brainstorm question — bring it to the user with data.

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,438 @@
# Phase N.5b — Terrain on the Modern Rendering Path — Design Spec
**Status:** Brainstormed 2026-05-09; not yet implemented.
**Author:** acdream lead engineer + Claude.
**Builds on:** Phase N.5 (`WbDrawDispatcher` on bindless + multi-draw indirect, shipped 2026-05-08).
**Predecessor docs (read first if you're new to this phase):**
- [`docs/research/2026-05-09-phase-n5b-handoff.md`](../../research/2026-05-09-phase-n5b-handoff.md) — cold-start briefing.
- [`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`](../plans/2026-05-08-phase-n5-modern-rendering.md) — N.5 plan + ship record.
- [`docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`](2026-05-08-phase-n5-modern-rendering-design.md) — N.5 spec; the substrate N.5b consumes.
- [`docs/ISSUES.md`](../../ISSUES.md) issue #51 — the load-bearing constraint this phase resolves.
---
## 1. Problem statement
N.5 lifted **entity** rendering onto bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher is 1.23 ms/frame median at Holtburg courtyard; ~810 fps sustained; ~12-15 GL calls/frame for entities regardless of scene complexity. Terrain is still on the older per-landblock pipeline (`TerrainChunkRenderer` at [src/AcDream.App/Rendering/TerrainChunkRenderer.cs](../../../src/AcDream.App/Rendering/TerrainChunkRenderer.cs)) — bind a per-chunk VAO + IBO, issue `glDrawElements` per visible chunk. At radius=2 that's ~25 GL calls/frame for terrain; at radius=5 it scales to ~121.
**N.5b's goal:** lift terrain rendering onto the same modern primitives N.5 just delivered, preserving the visible terrain pixel-for-pixel and preserving physics-vs-visual Z agreement (issue #51 / the cell-boundary wobble bug class).
The work is straightforward in shape — N.5's substrate (bindless wrapper, `DrawElementsIndirectCommand` struct, `[WB-DIAG]` instrumentation, two-phase Dispose pattern) is already built. The non-trivial decision is how to handle the formula divergence between WorldBuilder and retail.
---
## 2. The formula divergence (why Path A is dead)
WorldBuilder's `TerrainUtils.CalculateSplitDirection` ([references/WorldBuilder/.../TerrainUtils.cs:44-53](../../../references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44)) and acdream's `TerrainBlending.CalculateSplitDirection` ([src/AcDream.Core/Terrain/TerrainBlending.cs:56](../../../src/AcDream.Core/Terrain/TerrainBlending.cs:56)) use mathematically distinct formulas:
| | Formula | Source |
|---|---|---|
| acdream | `dw = x*y*0x0CCAC033 - x*0x421BE3BD + y*0x6C1AC587 - 0x519B8F25; bit31` | AC2D `Landblocks.cpp:346-350` |
| WB | `(seedA + 1813693831) - seedB - 1369149221 >= 0.5` (rescaled) where `seedA = (lbX*8+cellX)*214614067; seedB = (lbY*8+cellY)*1109124029` | clean-room reverse engineering |
**Verified retail authority:** the named retail decomp at [`docs/research/named-retail/acclient_2013_pseudo_c.txt`](../../research/named-retail/acclient_2013_pseudo_c.txt) lines 316042-316144 (function `CLandBlockStruct::ConstructPolygons` at retail address `00531d10`) contains the constants `0x0CCAC033 / 0x6C1AC587 / 0x421BE3BD / 0x519B8F25` verbatim. **Retail uses AC2D's formula.** acdream matches retail. **WB does not.**
**Quantified divergence** (per `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`, sweep across 255×255 landblocks × 64 cells = 4,161,600 cells):
| Comparison | Disagreement rate |
|---|---|
| Raw enum output (WB enum vs acdream enum) | **50.02%** |
| Diagonal-actually-painted (post-correcting for WB's inverted enum semantics) | **49.98%** |
| Holtburg town (0xA9B0) | 29/64 cells (45.3%) wrong if using WB |
| Worst landblock (0x4D96) | 47/64 cells (73.4%) wrong if using WB |
| Best landblock (0x0478) | 17/64 cells (26.6%) wrong if using WB |
The two formulas behave like independent random hashes. Adopting WB's pipeline wholesale (Path A) would visibly mis-render ~half the diagonals on every landblock — the cell-boundary wobble bug class would be present everywhere.
**Path A is dead.** N.5b commits to Path C (see Decision 1 below): use WB's *renderer* pattern (single global VBO/EBO + slot allocator + multi-draw indirect), driven by acdream's existing `LandblockMesh.Build` which uses retail's formula.
---
## 3. Decisions log
The eight brainstorm outcomes, locked.
| # | Decision | Choice | Reason |
|---|---|---|---|
| 1 | Formula source for cell split direction | **Path C — WB renderer pattern, acdream's `LandblockMesh.Build` + `TerrainBlending.CalculateSplitDirection`** (retail's formula) | Path A measured 49.98% diagonal-painted divergence vs retail. Path B (fork-patch WB) is permanent maintenance burden. Path C keeps a known-working asset and avoids fork friction. Same per-frame perf as either alternative. |
| 2 | Atlas model | **Keep `TerrainAtlas` (palCode-based fragment blending) + add bindless handles** | Visual correctness already locked in. Bindless wrapper is ~50 lines, cookie-cutter from N.5's `TextureCache.MakeResidentHandle` pattern. No perf win from adopting WB's `LandSurfaceManager`. |
| 3 | Mesh ownership | **Single global VBO/EBO + slot allocator, one slot per landblock** | Required for `glMultiDrawElementsIndirect` to actually win — per-LB IBOs would force per-LB binds, defeating the point. Mirrors N.5's pattern + WB's pattern. |
| 4 | Index format | **uint32 + baseVertex baked into indices on upload** | Matches WB's pattern verbatim ("maximum driver compatibility"). 192 KB extra IBO at 256 slots — rounding error vs vertex bytes. Future-proofs A.5's higher radius. |
| 5 | Shader unification | **Separate `terrain_modern.vert/.frag`** | Vertex layouts are meaningfully different (terrain: 6 attribs incl. palCode; entities: position+UV+normal+per-instance matrix). Unifying forces dead code on both sides; no perf win. |
| 6 | Streaming integration | **Mirror WB's slot allocator (free-list `Queue<int>` + power-of-two grow). Skip WB's 15s unload delay.** | Free-list standard; grow-by-doubling matches N.5 buffer growth pattern. The 15s delay would compete with `StreamingLoader`'s existing hysteresis — let one component own lifecycle policy. |
| 7 | Conformance test | **Pure-CPU sweep: visual mesh Z = `TerrainSurface.SampleZFromHeightmap` within 1mm, 10 representative landblocks × 100 sample points** | The exact issue #51 sentinel. ~1,000 assertions/run, <100ms, no GL infrastructure needed. Catches any silent formula or vertex-layout drift. |
| 8 | Visual verification gate | **4 outdoor scenes (Holtburg flat + sloped, Foundry-area, sloped LB) × 6 visual checks** | Outdoor-only — interiors / dungeons / EnvCells are out of scope and not testable yet. The wobble check is the load-bearing #51 sentinel. |
---
## 4. Architecture overview
### Per-frame draw flow
```
TerrainModernRenderer.Draw(camera, frustum, neverCullId):
1. Walk all loaded slots → per-slot frustum cull (AABB test).
Build _visibleSlots list (in-place reuse, no per-frame alloc).
2. If _visibleSlots.Count == 0: early-out.
3. Build per-frame DEIC array, one entry per visible slot:
DrawElementsIndirectCommand {
Count = 384, // verts/landblock
InstanceCount= 1,
FirstIndex = slot.FirstIndex, // baked offset into global IBO
BaseVertex = 0, // already baked into indices
BaseInstance = 0
}
4. If _drawIndirectCapacity < _visibleSlots.Count:
delete + re-allocate _indirectBuffer (power-of-two grow).
glBufferSubData(DRAW_INDIRECT_BUFFER, 0, sizeof(DEIC) * _visibleSlots.Count, deicArray)
5. shader.Use() // terrain_modern
6. Bind global VAO (_globalVao)
7. Set bindless handle uniforms: glProgramUniformHandleARB for uTerrain + uAlpha
8. Bind DRAW_INDIRECT_BUFFER (_indirectBuffer)
9. glMemoryBarrier(GL_COMMAND_BARRIER_BIT)
10. glMultiDrawElementsIndirect(Triangles, UnsignedInt, indirect=0,
drawcount=_visibleSlots.Count, stride=sizeof(DEIC))
11. Unbind VAO.
GL calls per frame for terrain: ~6-8 fixed.
- 1× shader.Use
- 1× BindVertexArray
- 2× ProgramUniformHandleARB (atlas handles)
- 1× BindBuffer for DRAW_INDIRECT_BUFFER
- 1× BufferSubData for DEIC array
- 1× MemoryBarrier
- 1× MultiDrawElementsIndirect
- 1× BindVertexArray(0)
```
### Per-landblock-load flow (streaming integration)
```
TerrainModernRenderer.AddLandblock(id, meshData, worldOrigin):
1. If id already present: RemoveLandblock(id) first (replaces).
2. Bake worldOrigin into vertex positions (CPU; ~12µs per landblock).
3. Acquire slot:
if _freeSlots.TryDequeue: reuse
else: slot = _nextFreeSlot++; if needed, EnsureCapacity(_nextFreeSlot).
4. Compute slot offsets:
slotByteOffset_VBO = slot * 384 * 40 bytes (15,360 bytes per slot)
slotByteOffset_IBO = slot * 384 * 4 bytes (1,536 bytes per slot)
firstIndex = slot * 384
baseVertex = slot * 384
5. Bake baseVertex into indices on CPU (indices[i] += baseVertex).
6. glBufferSubData(VBO, slotByteOffset_VBO, vertBytes, vertData).
7. glBufferSubData(IBO, slotByteOffset_IBO, idxBytes, bakedIndices).
8. Compute slot AABB (worldOrigin.x, worldOrigin.y, minZ, +192, +192, maxZ).
9. Store SlotData {id, worldOrigin, firstIndex, indexCount, aabbMin, aabbMax}.
10. _idToSlot[id] = slot.
TerrainModernRenderer.RemoveLandblock(id):
1. _idToSlot.TryGetValue(id) → slot.
2. _freeSlots.Enqueue(slot); _idToSlot.Remove(id); _slots[slot] = null.
(No GPU clear — DEIC list won't reference unused slots.)
EnsureCapacity(requiredSlots):
newCap = max(initialCapacity, currentCap * 2)
while newCap < requiredSlots: newCap *= 2.
Allocate new VBO + IBO at new size.
glCopyBufferSubData old → new (preserve loaded slot data).
Delete old; recreate VAO pointing at new VBO+IBO.
```
### Relation to N.5's existing dispatcher
`TerrainModernRenderer` is structurally **parallel** to `WbDrawDispatcher`, not nested under it. They share:
- `BindlessSupport` wrapper for `ARB_bindless_texture` calls
- `DrawElementsIndirectCommand` struct (20-byte layout)
- `[WB-DIAG]` instrumentation pattern (CPU `Stopwatch` + GPU `GL_TIME_ELAPSED` queries)
- `SceneLighting` UBO at binding=1
But they're separate dispatchers with separate global buffers, separate VAOs, separate shaders. Per frame, `GameWindow.Draw` calls them in sequence:
1. `_wbDrawDispatcher.Draw(...)` — entities (opaque + transparent passes)
2. `_terrainModern.Draw(...)` — terrain (single opaque pass)
3. Sky / particles / debug / UI on legacy paths until later phases retire them.
---
## 5. Component changes
### Files added
| File | Purpose | Approx. size |
|---|---|---|
| `src/AcDream.App/Rendering/TerrainModernRenderer.cs` | The new dispatcher. Owns global VBO/EBO + slot allocator + per-frame DEIC build + `glMultiDrawElementsIndirect` dispatch. | ~400-500 lines |
| `src/AcDream.App/Rendering/TerrainSlotAllocator.cs` | Pure-CPU helper extracted for unit testing: free-list slot management + DEIC array builder. | ~150 lines |
| `src/AcDream.App/Rendering/Shaders/terrain_modern.vert` | Vertex shader. Same per-cell layout as today's `terrain.vert` (locations 0-5). Reads bindless atlas handles via uniform. Same `SceneLighting` UBO at binding=1. Same per-vertex AdjustPlanes lighting bake. | ~150 lines |
| `src/AcDream.App/Rendering/Shaders/terrain_modern.frag` | Fragment shader. Same `combineOverlays` + `combineRoad` + `maskBlend3` as today's `terrain.frag`. Samples bindless `sampler2DArray` handles via `GL_ARB_bindless_texture` extension. Same fog + lightning flash + atmosphere. | ~150 lines |
| `tests/AcDream.Core.Tests/Terrain/TerrainModernConformanceTests.cs` | The Z-conformance sentinel for issue #51's bug class. ~10 representative landblocks × ~100 sample points; asserts `\|meshTriZ - TerrainSurface.SampleZFromHeightmap\| < 0.001m`. | ~150 lines |
| `tests/AcDream.Core.Tests/Rendering/TerrainSlotAllocatorTests.cs` | Unit tests for the slot allocator (free-list correctness, capacity grow, AABB tracking) + DEIC build correctness. Pure CPU; no GL. | ~200 lines |
### Files modified
| File | Change |
|---|---|
| `src/AcDream.App/Rendering/TerrainAtlas.cs` | Add `GetBindlessHandles()` returning `(ulong terrain, ulong alpha)`. Mirrors N.5's `TextureCache.MakeResidentHandle` pattern: generate handle once at first call, make resident, cache. The existing `GlTexture` / `GlAlphaTexture` `uint` properties stay (no legacy callers to migrate yet, but the path is preserved). |
| `src/AcDream.App/Rendering/GameWindow.cs` | Field declaration ([line 21](../../../src/AcDream.App/Rendering/GameWindow.cs:21)): `_terrain` field type `TerrainChunkRenderer? → TerrainModernRenderer?`. Construction ([line 1391](../../../src/AcDream.App/Rendering/GameWindow.cs:1391)): `new TerrainChunkRenderer(gl, shader, atlas)``new TerrainModernRenderer(gl, bindless, shader, atlas)`. Wire the `[TERRAIN-DIAG]` rollup callback (mirror the existing `[WB-DIAG]` callback wiring). |
| `docs/plans/2026-04-11-roadmap.md` | N.5b → "Shipped" row on completion; N.6 entry refreshed to remove "terrain on modern path" from scope. |
| `docs/ISSUES.md` | Issue #51 → "Recently closed" with the SHIP commit SHA. |
| `CLAUDE.md` "WB integration cribs" section | Add the N.5b crib: terrain dispatcher mirror of WB's pattern, retail-formula preserved via `LandblockMesh.Build` + `TerrainBlending.CalculateSplitDirection`. |
| `memory/project_phase_n5b_state.md` (new memory file) | Captures any high-value gotchas discovered during N.5b implementation (analogous to `project_phase_n5_state.md`'s three gotchas). |
### Files deleted
| File | Reason |
|---|---|
| `src/AcDream.App/Rendering/TerrainChunkRenderer.cs` (454 lines) | Replaced by `TerrainModernRenderer`. |
| `src/AcDream.App/Rendering/TerrainRenderer.cs` (247 lines) | Older sibling — already not wired in production. Has no users. Goes away in the same commit as `TerrainChunkRenderer`. |
| `src/AcDream.App/Rendering/Shaders/terrain.vert` (147 lines) | Replaced by `terrain_modern.vert`. |
| `src/AcDream.App/Rendering/Shaders/terrain.frag` (149 lines) | Replaced by `terrain_modern.frag`. |
### Net diff
- Adds: ~6 files, ~1,200 lines (renderer + slot-allocator + 2 shaders + 2 test files)
- Removes: ~4 files, ~1,000 lines (2 old renderers + 2 old shaders)
- Net: ~+200 lines for the same visual output, with the dispatcher collapsed to ~6-8 GL calls/frame regardless of scene size
### Public API of `TerrainModernRenderer`
```csharp
public sealed class TerrainModernRenderer : IDisposable
{
public TerrainModernRenderer(
GL gl,
BindlessSupport bindless,
Shader terrainModernShader,
TerrainAtlas atlas,
int initialSlotCapacity = 64);
public void AddLandblock(uint landblockId, LandblockMeshData mesh, Vector3 worldOrigin);
public void RemoveLandblock(uint landblockId);
public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null);
public int LoadedSlots { get; } // for [TERRAIN-DIAG]
public int VisibleSlots { get; } // for [TERRAIN-DIAG]
public int CapacitySlots { get; } // for [TERRAIN-DIAG]
public void Dispose();
}
```
Same external interface as today's `TerrainChunkRenderer` (`AddLandblock` + `RemoveLandblock` + `Draw`). Drop-in at `GameWindow.cs:1391`.
---
## 6. Vertex format & shader
### Vertex format: `TerrainVertex` stays as-is (40 bytes)
```csharp
[StructLayout(LayoutKind.Sequential)]
public readonly record struct TerrainVertex(
Vector3 Position, // 12 bytes — world-space (worldOrigin baked in by AddLandblock)
Vector3 Normal, // 12 bytes — per-vertex from central-difference (Phase 3b)
uint Data0, // 4 bytes — base+ovl0 tex/alpha indices
uint Data1, // 4 bytes — ovl1+ovl2 tex/alpha indices
uint Data2, // 4 bytes — road0+road1 tex/alpha indices
uint Data3); // 4 bytes — rotations + splitDir bit
// total: 40 bytes
```
Already correct, already debugged. Per-vertex normal is preserved because retail bakes AdjustPlanes lighting at the vertex stage — losing it would re-introduce the "warmer / less blue than retail" regression researched in [`docs/research/2026-04-24-lambert-brightness-split.md`](../../research/2026-04-24-lambert-brightness-split.md).
VAO attribute layout (locations 0-5, unchanged from today's `terrain.vert`):
| Loc | Type | Source | Purpose |
|---|---|---|---|
| 0 | vec3 (3 floats) | Position offset 0 | world-space position |
| 1 | vec3 (3 floats) | Normal offset 12 | per-vertex normal |
| 2 | uvec4 (4 bytes) | Data0 offset 24 | base+ovl0 tex/alpha |
| 3 | uvec4 (4 bytes) | Data1 offset 28 | ovl1+ovl2 tex/alpha |
| 4 | uvec4 (4 bytes) | Data2 offset 32 | road0+road1 tex/alpha |
| 5 | uvec4 (4 bytes) | Data3 offset 36 | rotations + splitDir |
### Shader: `terrain_modern.vert/.frag`
The structural change vs today's `terrain.vert/.frag` is small. The blend math, lighting bake, fog, lightning flash all stay verbatim. The only change is how textures are bound:
```glsl
// terrain_modern.frag — preamble
#version 460 core
#extension GL_ARB_bindless_texture : require
uniform sampler2DArray uTerrain; // 64-bit bindless handle, set per-frame
uniform sampler2DArray uAlpha; // 64-bit bindless handle, set per-frame
// SceneLighting UBO at binding=1 (unchanged from today)
layout(std140, binding = 1) uniform SceneLighting { ... };
// rest is unchanged from today's terrain.frag — combineOverlays, combineRoad,
// maskBlend3, applyFog, lightning flash are line-for-line identical
```
C# side per frame:
```csharp
// once at startup or first Draw, after atlas is built:
var (terrainHandle, alphaHandle) = atlas.GetBindlessHandles();
// MakeTextureHandleResidentARB called inside GetBindlessHandles, mirror N.5's pattern
// per frame:
shader.Use();
gl.ProgramUniformHandleARB(shader.Program, uTerrainLoc, terrainHandle);
gl.ProgramUniformHandleARB(shader.Program, uAlphaLoc, alphaHandle);
// ... bind global VAO + DEIC + glMultiDrawElementsIndirect
```
The bindless extension makes texture access syntactically identical to today's `sampler2DArray` uniform — the only difference is *how* the sampler is set on the C# side. GLSL doesn't know it's bindless.
### SSBO/UBO binding map (cross-checked with N.5)
| Binding | Type | Owner | Used by |
|---|---|---|---|
| SSBO=0 | `Instances[]` (mat4) | `WbDrawDispatcher` | `mesh_modern.vert` |
| SSBO=1 | `Batches[]` (handle+layer+flags) | `WbDrawDispatcher` | `mesh_modern.vert/.frag` |
| **SSBO=2** | (reserved) | — | future per-batch terrain data when A.5 wants per-LB atlas variation |
| UBO=1 | `SceneLighting` | `GameWindow` (set once/frame) | `mesh_modern.frag`, `terrain_modern.vert/.frag`, `sky.frag`, etc. |
N.5b doesn't introduce a new SSBO. The atlas handles are uniforms, not SSBO entries — atlas is region-wide so per-frame upload is two `uvec2`s (16 bytes), not worth the SSBO machinery. SSBO=2 stays available for future per-batch terrain data.
### What's preserved bit-for-bit from today's shaders
- `unpackOverlayLayer(...)` (rotation logic for overlays)
- The `gl_VertexID % 6 → corner` table for both SWtoNE and SEtoNW splits (the geometry mapping that was debugged 2026-04-21 to match ACE's `ConstructPolygons`)
- `MIN_FACTOR = 0.0` for the AdjustPlanes Lambert floor (the brightness research)
- `combineOverlays` + `combineRoad` + `maskBlend3` fragment math
- `applyFog` distance-blend
- Lightning flash additive overlay
- Per-vertex sun + ambient bake into `vLightingRGB`
---
## 7. Conformance + verification
### CPU unit tests (no GL required)
**`tests/AcDream.Core.Tests/Rendering/TerrainSlotAllocatorTests.cs`** — exercises the dispatcher's pure-CPU pieces in isolation:
| Test | Asserts |
|---|---|
| `Add_FirstLandblock_GetsSlotZero` | `_nextFreeSlot` starts at 0; first add uses slot 0 |
| `Add_SecondLandblock_GetsSlotOne` | Sequential adds use sequential slots |
| `RemoveThenAdd_ReusesFreedSlot` | Free-list FIFO: remove slot 0, add new LB → slot 0 again |
| `Add_BeyondInitialCapacity_DoublesCapacity` | After 64 adds, 65th triggers grow to 128 |
| `AddSameId_ReplacesExistingSlot` | Re-adding an LB id replaces in same slot (no leak) |
| `Build_DeicArray_VisibleSlotsOnly` | DEIC array has one entry per visible slot, `firstIndex = slot * 384`, `count = 384` |
| `Build_DeicArray_EmptyVisible` | No visible → empty array |
| `Aabb_StoredFromWorldOrigin` | Slot's AABB is `(origin.x, origin.y, minZ)..(origin.x+192, origin.y+192, maxZ)` |
**`tests/AcDream.Core.Tests/Terrain/TerrainModernConformanceTests.cs`** — the Z-conformance sentinel for issue #51's bug class.
Pattern modeled on the existing `ClientConformanceTests.cs`. For each landblock:
1. Load real dat heightmap data (10 representative landblocks: Holtburg flat 0xA9B0, Holtburg sloped 0xA9B1, Foundry 0x8080, Cragstone 0xCB99, Direlands sample 0xC040, plus 5 randomly-chosen sloped landblocks from a fixed seed for variety).
2. Build mesh via `LandblockMesh.Build(...)` (the source-of-truth generator that `TerrainModernRenderer` calls internally).
3. For 100 (localX, localY) sample points uniformly distributed in `[0, 192] × [0, 192]`:
- Compute `meshTriZ`: find the triangle in the built mesh containing the point, barycentric-interpolate Z from its three vertex Zs.
- Compute `physicsZ = TerrainSurface.SampleZFromHeightmap(heights, heightTable, lbX, lbY, localX, localY)`.
- Assert `|meshTriZ - physicsZ| < 0.001m` (1 mm tolerance — well below visible threshold).
4. Total: 10 landblocks × 100 points = 1,000 assertions per run; runs in <100 ms.
If this test fires, the pipeline has silently drifted (different formula somewhere, swapped vertex order, baseVertex baked wrong, etc.) — the exact bug class issue #51 names.
### Existing tests stay green
| Test file | Proves | N.5b impact |
|---|---|---|
| `TerrainBlendingTests.cs` | `CalculateSplitDirection` returns retail's formula | unchanged — still passes |
| `LandblockMeshTests.cs` | `LandblockMesh.Build` produces correct triangles | unchanged — still passes |
| `ClientConformanceTests.cs` | Existing conformance sweep | unchanged — still passes |
| `SplitFormulaDivergenceTest.cs` | WB↔retail divergence is real (49.98%) | unchanged — runs as data documentation; passes |
| All 71 tests in N.5 filter (Wb+MatrixComposition+TextureCacheBindless) | N.5 ship intact | unchanged — terrain is a separate dispatcher |
### `[TERRAIN-DIAG]` instrumentation
A new dedicated `[TERRAIN-DIAG]` log line, parallel to the existing `[WB-DIAG]` line, so terrain perf is observable independent of entity perf. Two parallel dispatchers, two parallel diag lines:
```
[TERRAIN-DIAG] cpu_ms=avg/95th draws=N/frame visible=N loaded=N capacity=N
```
- `cpu_ms``Stopwatch` around `TerrainModernRenderer.Draw`. Median + 95th percentile over the 5-second rollup window.
- `draws` — DEIC drawcount param (number of visible landblocks dispatched per `glMultiDrawElementsIndirect` call). Should be 6-8 GL calls fixed per frame regardless of `draws` value.
- `visible` / `loaded` / `capacity` — slot accounting; for spotting growth or leaks.
- `gpu_ms``GL_TIME_ELAPSED` query around the indirect dispatch. Same double-buffering caveat as N.5 (deferred to N.6 perf polish; will report `0/0` until then).
### Visual verification gate (user runs the client)
**Scenes** (drive the character through each):
1. **Holtburg town** (~0xA9B0 area) — flat terrain + roads
2. **Holtburg sloped landblock** (~0xA9B1) — slopes + cell-boundary diagonal transitions
3. **Foundry-area** (~0x80xx) — different blend palette
4. **Any visibly-sloped outdoor landblock** — Direlands or wherever you regularly test slope behavior
**Checks** at each scene:
1. **No cell-boundary wobble** — the load-bearing #51 sentinel
2. **No missing chunks / black holes** — slot allocator or DEIC misalignment
3. **No texture seams at landblock edges** — pre-N.5b regression check
4. **No z-fighting** — pre-N.5b regression check
5. **`[TERRAIN-DIAG] draws=N` ~6-8 GL calls/frame regardless of N**
6. **`[TERRAIN-DIAG] cpu_ms` at radius=5 is ≥10% lower** than the pre-N.5b baseline (recorded in `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`)
Acceptance: all six checks pass in all four scenes. **Outdoor-only — interiors / dungeons / EnvCells are out of scope and not testable yet**.
---
## 8. Acceptance criteria
1. Build green; existing tests stay green; new conformance test passes (`|deltaZ| < 1mm` across the sweep).
2. Visual identity to today confirmed at the four user-verification scenes.
3. `[TERRAIN-DIAG]` shows terrain at ~6-8 GL calls/frame regardless of scene size (vs today's 25-121).
4. No cell-boundary wobble at any visited landblock (the #51 sentinel).
5. **CPU dispatcher time at radius=5 ≥10% lower** than today's `TerrainChunkRenderer` per-LB-binds path. Measured via the `[TERRAIN-DIAG] cpu_ms` median over a 5-second rollup at the Holtburg test scene with radius=5; before/after numbers captured into `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` (mirror N.5's perf baseline doc convention).
6. Issue #51 closed in `docs/ISSUES.md` with the SHIP commit SHA.
---
## 9. Out-of-scope (explicit boundaries)
N.5b does **not** ship any of these. Each is a separate phase or backlog item:
- **EnvCells / interior cells / dungeons** — different mesh source (cell-bound static geometry, not heightmap). Future phase, not currently scoped on the roadmap.
- **Sky rendering** (`SkyRenderer.cs`) — N.8 territory.
- **Particle rendering** (`ParticleRenderer.cs`) — N.8 territory.
- **Two-tier streaming + horizon LOD** (A.5) — separate brainstorm. Different streaming primitive (visible window split into "near tier" full-detail and "far tier" coarse-LOD). N.5b deliberately doesn't touch streaming radius or LOD machinery.
- **WB's `LandSurfaceManager` adoption** — Decision 2 explicitly keeps `TerrainAtlas`. Revisit only if a specific feature requires per-landblock alpha-mask bake.
- **WB's `TerrainGeometryGenerator` adoption** — Path C explicitly keeps acdream's `LandblockMesh.Build` as the source of truth. Don't call into WB's generator.
- **Fork-patching WB upstream** — Path C avoids this entirely. The WB submodule stays clean.
- **Persistent-mapped buffers / GPU-side culling / GL_TIME_ELAPSED double-buffering** — N.6 perf polish territory; not in N.5b scope.
- **Per-instance terrain "highlight" or per-LB tint** — no analogue need today; defer to backlog if a use case appears.
- **Removing `Texture2D` / `sampler2D` legacy texture path** — N.6 cleanup once Sky/Terrain/Debug/particle paths all migrate. N.5b only adds the `Texture2DArray` bindless path; legacy stays for non-terrain consumers.
- **Visual changes** — terrain renders pixel-for-pixel identical to today (same vertex layout, same blend math, same lighting bake). The phase is purely a dispatch-mechanism upgrade. Any visible diff means a bug, not a feature.
---
## 10. Implementation guidance
The phase is sized at ~1 week. Tasks decompose into ~10 mostly-parallel chunks:
1. **`TerrainAtlas` bindless extension** — add `GetBindlessHandles()` method. ~50 lines. Independent of dispatcher.
2. **`TerrainSlotAllocator`** — pure-CPU helper class. ~150 lines. Independent of GL.
3. **`TerrainSlotAllocatorTests`** — unit tests for #2. ~200 lines. Depends on #2.
4. **`terrain_modern.vert`** — port of today's `terrain.vert` with bindless preamble. ~150 lines. Independent.
5. **`terrain_modern.frag`** — port of today's `terrain.frag` with bindless preamble. ~150 lines. Independent.
6. **`TerrainModernRenderer`** — dispatcher class wiring slot allocator + GL state + bindless handle uniforms + DEIC dispatch. ~400 lines. Depends on #1, #2.
7. **`TerrainModernConformanceTests`** — Z-conformance sentinel. ~150 lines. Depends on `LandblockMesh.Build` (existing).
8. **`GameWindow` integration** — swap `TerrainChunkRenderer``TerrainModernRenderer` at field+construction; add `[TERRAIN-DIAG]` rollup. ~30 lines. Depends on #6.
9. **Delete legacy**`TerrainChunkRenderer.cs`, `TerrainRenderer.cs`, `terrain.vert`, `terrain.frag`. Depends on #8 working in production.
10. **Roadmap + ISSUES.md + memory** — close issue #51, update CLAUDE.md "WB integration cribs", write `memory/project_phase_n5b_state.md`. Depends on #8 + visual verification.
Tasks 1, 2, 4, 5, 7 can land in parallel. Task 6 depends on 1+2. Task 8 depends on 6. Tasks 9 and 10 are post-verification cleanup.
The plan document (next step after this spec) breaks each task into TDD-style subtasks with clear acceptance gates per subagent dispatch.

View file

@ -18,8 +18,10 @@ public sealed class GameWindow : IDisposable
private IWindow? _window;
private GL? _gl;
private IInputContext? _input;
private TerrainChunkRenderer? _terrain;
private Shader? _shader;
private TerrainModernRenderer? _terrain;
/// <summary>Phase N.5b: terrain_modern.vert/.frag program. Owned by
/// <see cref="_terrain"/> at draw time but allocated + disposed here.</summary>
private Shader? _terrainModernShader;
private CameraController? _cameraController;
private IMouse? _capturedMouse;
private DatCollection? _dats;
@ -68,6 +70,15 @@ public sealed class GameWindow : IDisposable
private string _lastNearestObjLabel = "-";
private bool _lastColliding;
// Phase N.5b: CPU timing for [TERRAIN-DIAG] under ACDREAM_WB_DIAG=1
// (parallel diagnostic to [WB-DIAG] in WbDrawDispatcher — same env var
// gate so flipping one switch turns on both dispatcher rollups). Mirrors
// the rolling-256-sample buffer pattern from WbDrawDispatcher.
private readonly System.Diagnostics.Stopwatch _terrainCpuStopwatch = new();
private readonly long[] _terrainCpuSamples = new long[256]; // microseconds
private int _terrainCpuSampleCursor;
private long _terrainLastDiagTick;
// Phase A.1: streaming fields replacing the one-shot _entities list.
private AcDream.App.Streaming.LandblockStreamer? _streamer;
private AcDream.App.Streaming.GpuWorldState _worldState = new();
@ -965,9 +976,13 @@ public sealed class GameWindow : IDisposable
_gl.Enable(EnableCap.DepthTest);
string shadersDir = Path.Combine(AppContext.BaseDirectory, "Rendering", "Shaders");
_shader = new Shader(_gl,
Path.Combine(shadersDir, "terrain.vert"),
Path.Combine(shadersDir, "terrain.frag"));
// Phase N.5b: terrain_modern shader pair — bindless texture handles +
// glMultiDrawElementsIndirect dispatch path. The only terrain shader
// since Task 9 retired the legacy terrain.vert/.frag program.
_terrainModernShader = new Shader(_gl,
Path.Combine(shadersDir, "terrain_modern.vert"),
Path.Combine(shadersDir, "terrain_modern.frag"));
// Phase G.1/G.2: shared scene-lighting UBO. Stays bound at
// binding=1 for the lifetime of the process — every shader that
@ -1385,10 +1400,44 @@ public sealed class GameWindow : IDisposable
// TimeSync arrives.
WorldTime.SyncFromServer(AcDream.Core.World.DerethDateTime.DayTicks / 16.0); // = 476.25 = Midsong (noon)
// Build the terrain atlas once from the Region dat.
var terrainAtlas = AcDream.App.Rendering.TerrainAtlas.Build(_gl, _dats);
// N.5: detect ARB_bindless_texture + ARB_shader_draw_parameters BEFORE
// building the terrain atlas / renderer — both consume BindlessSupport
// (atlas via Texture2DArray bindless handles, renderer for SSBO uploads).
// The modern path (SSBO + glMultiDrawElementsIndirect + bindless textures)
// is mandatory as of Phase N.5 — missing extensions throw at startup with
// a clear error so users can file a real bug report rather than silently
// falling back to a half-working renderer.
if (AcDream.App.Rendering.Wb.BindlessSupport.TryCreate(_gl, out var bindless))
{
if (bindless!.HasShaderDrawParameters(_gl))
{
_bindlessSupport = bindless;
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
}
else
{
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present — modern path not available");
}
}
else
{
Console.WriteLine("[N.5] GL_ARB_bindless_texture not present — modern path not available");
}
_terrain = new TerrainChunkRenderer(_gl, _shader, terrainAtlas);
if (_bindlessSupport is null)
{
throw new NotSupportedException(
"acdream requires GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters " +
"(GL 4.3+ with bindless support). Your GPU/driver does not expose these extensions. " +
"If this is unexpected, please file a bug report with your GPU vendor + driver version.");
}
// Build the terrain atlas once from the Region dat. Phase N.5b: the
// atlas exposes bindless handles for the modern terrain path, so
// BindlessSupport is threaded through.
var terrainAtlas = AcDream.App.Rendering.TerrainAtlas.Build(_gl, _dats, _bindlessSupport);
_terrain = new TerrainModernRenderer(_gl, _bindlessSupport, _terrainModernShader!, terrainAtlas);
int centerX = (int)((centerLandblockId >> 24) & 0xFFu);
int centerY = (int)((centerLandblockId >> 16) & 0xFFu);
@ -1418,35 +1467,8 @@ public sealed class GameWindow : IDisposable
_heightTable = heightTable;
_surfaceCache = new Dictionary<uint, AcDream.Core.Terrain.SurfaceInfo>();
// N.5: detect ARB_bindless_texture + ARB_shader_draw_parameters.
// The modern path (SSBO + glMultiDrawElementsIndirect + bindless textures)
// is mandatory as of Phase N.5 — missing extensions throw at startup with
// a clear error so users can file a real bug report rather than silently
// falling back to a half-working renderer.
if (AcDream.App.Rendering.Wb.BindlessSupport.TryCreate(_gl, out var bindless))
{
if (bindless!.HasShaderDrawParameters(_gl))
{
_bindlessSupport = bindless;
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
}
else
{
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present — modern path not available");
}
}
else
{
Console.WriteLine("[N.5] GL_ARB_bindless_texture not present — modern path not available");
}
if (_bindlessSupport is null)
{
throw new NotSupportedException(
"acdream requires GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters " +
"(GL 4.3+ with bindless support). Your GPU/driver does not expose these extensions. " +
"If this is unexpected, please file a bug report with your GPU vendor + driver version.");
}
// (Bindless detection moved above — must precede TerrainAtlas.Build /
// TerrainModernRenderer ctor so they can consume BindlessSupport.)
// Mesh shader always loads (modern path is the only path).
_meshShader = new Shader(_gl,
@ -4723,7 +4745,7 @@ public sealed class GameWindow : IDisposable
float localY = spawn.LocalPosition.Y;
// Prefer the physics engine's terrain sampler (TerrainSurface.SampleZ)
// — it uses the same AC2D render split-direction formula the
// TerrainChunkRenderer uses for the visible terrain mesh. This
// TerrainModernRenderer uses for the visible terrain mesh. This
// guarantees trees are placed on the SAME Z height the player
// walks on. If physics hasn't registered this landblock yet,
// fall back to the local bilinear sample.
@ -6314,7 +6336,19 @@ public sealed class GameWindow : IDisposable
goto SkipWorldGeometry;
}
// Phase N.5b: wrap Draw in CPU stopwatch for [TERRAIN-DIAG] rollup
// (gated on ACDREAM_WB_DIAG=1, same env var as [WB-DIAG]). Stopwatch
// is cheap; only the periodic Console.WriteLine is gated.
_terrainCpuStopwatch.Restart();
_terrain?.Draw(camera, frustum, neverCullLandblockId: playerLb);
_terrainCpuStopwatch.Stop();
// Multiply by 100 then divide by 100 in the diag print to keep
// 0.01 µs precision in the long-typed sample buffer. Terrain Draw
// is sub-microsecond on simple scenes; truncating to integer µs
// would round nearly every sample to 0.
_terrainCpuSamples[_terrainCpuSampleCursor] = (long)(_terrainCpuStopwatch.Elapsed.TotalMicroseconds * 100.0);
_terrainCpuSampleCursor = (_terrainCpuSampleCursor + 1) % _terrainCpuSamples.Length;
MaybeFlushTerrainDiag();
// Conditional depth clear: when camera is inside a building, clear
// depth (not color) so interior geometry writes fresh Z values on top
@ -8713,6 +8747,61 @@ public sealed class GameWindow : IDisposable
}
}
/// <summary>Phase N.5b: emits [TERRAIN-DIAG] once per ~5s under
/// ACDREAM_WB_DIAG=1. Mirrors <c>WbDrawDispatcher.MaybeFlushDiag</c>:
/// rolling 256-sample buffer of microseconds, median + p95 reported.
/// Sample buffer is NOT cleared on flush — it's a moving window so the
/// next 5s window already has 256 frames of recent history.</summary>
private void MaybeFlushTerrainDiag()
{
if (!string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal))
return;
long now = Environment.TickCount64;
if (now - _terrainLastDiagTick <= 5000) return;
// Samples are stored as microseconds × 100 (so 1.23 µs becomes 123 long).
long cpuMedHundredthsUs = TerrainDiagMedianMicros(_terrainCpuSamples);
long cpuP95HundredthsUs = TerrainDiagPercentile95Micros(_terrainCpuSamples);
double cpuMedUs = cpuMedHundredthsUs / 100.0;
double cpuP95Us = cpuP95HundredthsUs / 100.0;
Console.WriteLine(
$"[TERRAIN-DIAG] cpu_us={cpuMedUs:F2}m/{cpuP95Us:F2}p95 " +
$"draws={_terrain?.VisibleSlots ?? 0}/frame " +
$"visible={_terrain?.VisibleSlots ?? 0} " +
$"loaded={_terrain?.LoadedSlots ?? 0} " +
$"capacity={_terrain?.CapacitySlots ?? 0}");
_terrainLastDiagTick = now;
}
private static long TerrainDiagMedianMicros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
int nz = 0;
foreach (var v in copy) if (v > 0) nz++;
if (nz == 0) return 0;
// Sorted ascending: zero-padding at the front, samples at the back.
// Median of nz samples is the middle of the last nz entries; using
// (nz - 1) / 2 from the end keeps the offset >= 0 for all nz >= 1
// (the original nz / 2 form underflowed to copy.Length on first
// diag-flush when only 1 sample had been recorded).
return copy[copy.Length - 1 - (nz - 1) / 2];
}
private static long TerrainDiagPercentile95Micros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
int nz = 0;
foreach (var v in copy) if (v > 0) nz++;
if (nz == 0) return 0;
// 95th percentile = upper end of the sorted samples; clamp the
// offset to stay inside the populated tail when nz < 20.
int offset = (int)((nz - 1) * 0.05);
return copy[copy.Length - 1 - offset];
}
private void OnClosing()
{
// Phase A.1: join the streamer worker thread before tearing down GL
@ -8732,7 +8821,7 @@ public sealed class GameWindow : IDisposable
_meshShader?.Dispose();
_terrain?.Dispose();
_shader?.Dispose();
_terrainModernShader?.Dispose();
_sceneLightingUbo?.Dispose();
_particleRenderer?.Dispose();
_debugLines?.Dispose();

View file

@ -1,9 +1,18 @@
#version 430 core
// Per-cell terrain blending (Phase 3c.4) — ported from WorldBuilder's
// Landscape.frag, trimmed of editor-specific features (grid, brush,
// walkable-slope highlighting). Phase G extends this with the shared
// SceneLighting UBO driving per-vertex sun bake + fragment-stage fog
// + lightning flash.
#version 460 core
#extension GL_ARB_bindless_texture : require
// Phase N.5b: terrain fragment shader on the modern bindless dispatcher.
// Math identical to terrain.frag (Phase 3c per-cell maskBlend3 +
// Phase G fog + lightning flash).
//
// Bindless texture handles are passed as uvec2 (low/high 32 bits) and
// reconstructed into sampler2DArray at use sites via the GLSL
// sampler-from-handle constructor. The alternative pattern —
// `uniform sampler2DArray` set via glProgramUniformHandleARB — produces
// GL_INVALID_OPERATION on at least one driver in practice (NVIDIA on
// Windows). The uvec2 + constructor pattern is what N.5's mesh_modern
// shader uses and is the documented "always works" form per the
// ARB_bindless_texture spec.
in vec2 vBaseUV;
in vec3 vWorldNormal;
@ -18,11 +27,11 @@ flat in float vBaseTexIdx;
out vec4 fragColor;
uniform sampler2DArray uTerrain; // 33+ layers — TerrainAtlas.GlTexture
uniform sampler2DArray uAlpha; // 8+ layers — TerrainAtlas.GlAlphaTexture
uniform uvec2 uTerrainHandle;
uniform uvec2 uAlphaHandle;
#define uTerrain sampler2DArray(uTerrainHandle)
#define uAlpha sampler2DArray(uAlphaHandle)
// Shared scene-lighting UBO — fog + flash are consumed here; the per-vertex
// AdjustPlanes bake already incorporated sun + ambient.
struct Light {
vec4 posAndKind;
vec4 dirAndRange;
@ -37,12 +46,8 @@ layout(std140, binding = 1) uniform SceneLighting {
vec4 uCameraAndTime;
};
// Per-texture tiling repeat count across a cell. WorldBuilder uses
// uTexTiling[36] uploaded from the dats; we default to 1.0 (one tile per
// cell, 8 tiles across a landblock).
const float TILE = 1.0;
// Three-layer alpha-weighted composite.
vec4 maskBlend3(vec4 t0, vec4 t1, vec4 t2, float h0, float h1, float h2) {
float a0 = h0 == 0.0 ? 1.0 : t0.a;
float a1 = h1 == 0.0 ? 1.0 : t1.a;
@ -129,20 +134,16 @@ void main() {
if (vRoad0.z >= 0.0)
roads = combineRoad(vBaseUV, vRoad0, vRoad1);
// Composite: base × (1 - ovlA) × (1 - rdA) + ovl × ovlA × (1 - rdA) + road × rdA
vec3 baseMasked = baseColor.rgb * ((1.0 - overlays.a) * (1.0 - roads.a));
vec3 ovlMasked = overlays.rgb * (overlays.a * (1.0 - roads.a));
vec3 roadMasked = roads.rgb * roads.a;
vec3 rgb = clamp(baseMasked + ovlMasked + roadMasked, 0.0, 1.0);
// Apply the per-vertex baked sun+ambient.
vec3 lit = rgb * min(vLightingRGB, vec3(1.0));
// Lightning flash — additive.
float flash = uFogParams.z;
lit += flash * vec3(0.6, 0.6, 0.75);
// Atmospheric fog.
lit = applyFog(lit, vWorldPos);
fragColor = vec4(lit, 1.0);

View file

@ -1,17 +1,21 @@
#version 430 core
#version 460 core
#extension GL_ARB_bindless_texture : require
// Phase N.5b: terrain shader on the modern bindless dispatcher.
// Math identical to terrain.vert (Phase 3c per-cell mesh + Phase G AdjustPlanes
// lighting). The only structural change is the version + bindless extension
// — sampler access in the fragment stage is unchanged at the GLSL level.
layout(location = 0) in vec3 aPos;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in uvec4 aPacked0; // bytes: baseTex, baseAlpha(255), ovl0Tex, ovl0Alpha
layout(location = 3) in uvec4 aPacked1; // bytes: ovl1Tex, ovl1Alpha, ovl2Tex, ovl2Alpha
layout(location = 4) in uvec4 aPacked2; // bytes: road0Tex, road0Alpha, road1Tex, road1Alpha
layout(location = 5) in uvec4 aPacked3; // bits: rot fields + splitDir (see below)
layout(location = 2) in uvec4 aPacked0;
layout(location = 3) in uvec4 aPacked1;
layout(location = 4) in uvec4 aPacked2;
layout(location = 5) in uvec4 aPacked3;
uniform mat4 uView;
uniform mat4 uProjection;
// Phase G.1+G.2: sky/scene UBO. Terrain reads uLights[0] for the sun
// (slot 0 is reserved) plus uCellAmbient for outdoor ambient; the fog
// fields are consumed by the fragment stage.
struct Light {
vec4 posAndKind;
vec4 dirAndRange;
@ -29,9 +33,7 @@ layout(std140, binding = 1) uniform SceneLighting {
out vec2 vBaseUV;
out vec3 vWorldNormal;
out vec3 vWorldPos;
out vec3 vLightingRGB; // pre-computed sun+ambient contribution for retail-style AdjustPlanes bake
// Per-layer "UV.xy in cell-local 0..1 space, tex index .z, alpha index .w".
// Negative .z means "layer not present, skip it in the fragment shader."
out vec3 vLightingRGB;
out vec4 vOverlay0;
out vec4 vOverlay1;
out vec4 vOverlay2;
@ -53,9 +55,6 @@ flat out float vBaseTexIdx;
// Cross-ref: docs/research/2026-04-24-lambert-brightness-split.md.
const float MIN_FACTOR = 0.0;
// Port of WorldBuilder's Landscape.vert unpackOverlayLayer: sentinel-check
// 255 → -1 (shader skips), then rotate the cell-local UV by the overlay's
// 90° rotation count.
vec4 unpackOverlayLayer(uint texIdxU, uint alphaIdxU, uint rotIdx, vec2 baseUV) {
float texIdx = float(texIdxU);
float alphaIdx = float(alphaIdxU);
@ -121,15 +120,9 @@ void main() {
vWorldPos = aPos;
vWorldNormal = normalize(aNormal);
// Retail AdjustPlanes bake (r13 §7):
// L = max(N · -sunDir, MIN_FACTOR)
// vertex.color = sun_color * L + ambient_color
//
// Slot 0 of the UBO is the sun (directional). We read its forward
// vector and pre-multiplied color, apply the ambient floor, layer
// in the scene ambient separately.
vec3 sunDir = uLights[0].dirAndRange.xyz;
vec3 sunCol = uLights[0].colorAndIntensity.xyz * uLights[0].colorAndIntensity.w;
// Retail AdjustPlanes bake (terrain.vert:124-134 — identical math).
vec3 sunDir = uLights[0].dirAndRange.xyz;
vec3 sunCol = uLights[0].colorAndIntensity.xyz * uLights[0].colorAndIntensity.w;
float L = max(dot(vWorldNormal, -sunDir), MIN_FACTOR);
vLightingRGB = sunCol * L + uCellAmbient.xyz;

View file

@ -53,14 +53,45 @@ public sealed unsafe class TerrainAtlas : IDisposable
/// <summary>RCode for each RoadMap, parallel to <see cref="RoadAlphaLayers"/>.</summary>
public IReadOnlyList<uint> RoadAlphaRCodes { get; }
private readonly Wb.BindlessSupport? _bindless;
// Cached bindless handles. Generated lazily on first GetBindlessHandles() call;
// reused for the lifetime of the atlas.
private ulong _terrainHandle;
private ulong _alphaHandle;
private bool _handlesGenerated;
/// <summary>
/// Get 64-bit bindless handles for the terrain + alpha texture arrays.
/// Throws <see cref="InvalidOperationException"/> if the atlas was constructed
/// without a <see cref="Wb.BindlessSupport"/> instance. Handles are generated
/// lazily on first call and cached for the atlas's lifetime; both textures
/// are made resident.
/// </summary>
public (ulong terrain, ulong alpha) GetBindlessHandles()
{
if (_bindless is null)
throw new InvalidOperationException(
"TerrainAtlas was constructed without BindlessSupport; cannot return bindless handles.");
if (!_handlesGenerated)
{
_terrainHandle = _bindless.GetResidentHandle(GlTexture);
_alphaHandle = _bindless.GetResidentHandle(GlAlphaTexture);
_handlesGenerated = true;
}
return (_terrainHandle, _alphaHandle);
}
private TerrainAtlas(
GL gl,
Wb.BindlessSupport? bindless,
uint glTexture, IReadOnlyDictionary<uint, uint> map, int layerCount,
uint glAlphaTexture, int alphaLayerCount,
IReadOnlyList<byte> cornerLayers, IReadOnlyList<byte> sideLayers, IReadOnlyList<byte> roadLayers,
IReadOnlyList<uint> cornerTCodes, IReadOnlyList<uint> sideTCodes, IReadOnlyList<uint> roadRCodes)
{
_gl = gl;
_bindless = bindless;
GlTexture = glTexture;
TerrainTypeToLayer = map;
LayerCount = layerCount;
@ -79,7 +110,7 @@ public sealed unsafe class TerrainAtlas : IDisposable
/// for the mapping from TerrainTextureType to SurfaceTexture id, decoding each
/// to RGBA8, and uploading as layers in a single GL_TEXTURE_2D_ARRAY.
/// </summary>
public static TerrainAtlas Build(GL gl, DatCollection dats)
public static TerrainAtlas Build(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
{
var region = dats.Get<Region>(0x13000000u)
?? throw new InvalidOperationException("Region dat id 0x13000000 missing");
@ -89,7 +120,7 @@ public sealed unsafe class TerrainAtlas : IDisposable
if (terrainDesc is null || terrainDesc.Count == 0)
{
Console.WriteLine("WARN: TerrainDesc missing, using single white fallback layer");
return BuildFallback(gl);
return BuildFallback(gl, bindless);
}
// ---- Terrain atlas (unchanged Phase 2b logic) ----
@ -167,6 +198,7 @@ public sealed unsafe class TerrainAtlas : IDisposable
return new TerrainAtlas(
gl,
bindless,
tex, map, layerCount,
alphaBuild.gl, alphaBuild.layerCount,
alphaBuild.corner, alphaBuild.side, alphaBuild.road,
@ -350,7 +382,7 @@ public sealed unsafe class TerrainAtlas : IDisposable
return dst;
}
private static TerrainAtlas BuildFallback(GL gl)
private static TerrainAtlas BuildFallback(GL gl, Wb.BindlessSupport? bindless = null)
{
uint tex = gl.GenTexture();
gl.BindTexture(TextureTarget.Texture2DArray, tex);
@ -372,6 +404,7 @@ public sealed unsafe class TerrainAtlas : IDisposable
return new TerrainAtlas(
gl,
bindless,
tex, new Dictionary<uint, uint> { [0] = 0u }, 1,
alphaTex, 1,
Array.Empty<byte>(), Array.Empty<byte>(), Array.Empty<byte>(),
@ -380,6 +413,16 @@ public sealed unsafe class TerrainAtlas : IDisposable
public void Dispose()
{
// Phase 1: release bindless residency BEFORE deleting textures.
// ARB_bindless_texture requires this ordering; interleaving is UB.
if (_handlesGenerated && _bindless is not null)
{
_bindless.MakeNonResident(_terrainHandle);
_bindless.MakeNonResident(_alphaHandle);
_handlesGenerated = false;
}
// Phase 2: delete the underlying GL textures.
_gl.DeleteTexture(GlTexture);
_gl.DeleteTexture(GlAlphaTexture);
}

View file

@ -1,454 +0,0 @@
using System.Numerics;
using AcDream.Core.Terrain;
using Silk.NET.OpenGL;
namespace AcDream.App.Rendering;
/// <summary>
/// Chunk-based terrain renderer matching ACME's architecture. Each 16x16
/// landblock region gets its own VAO/VBO/EBO with pre-allocated max-size
/// buffers. Landblocks are added/removed incrementally via glBufferSubData
/// instead of rebuilding the entire buffer.
///
/// Attribute layout (same as TerrainRenderer, see TerrainVertex):
/// location 0: vec3 aPos (3 floats, world space)
/// location 1: vec3 aNormal (3 floats)
/// location 2: uvec4 aPacked0 (4 bytes, Data0)
/// location 3: uvec4 aPacked1 (4 bytes, Data1)
/// location 4: uvec4 aPacked2 (4 bytes, Data2)
/// location 5: uvec4 aPacked3 (4 bytes, Data3)
/// </summary>
public sealed unsafe class TerrainChunkRenderer : IDisposable
{
// -------------------------------------------------------------------------
// Constants
// -------------------------------------------------------------------------
/// <summary>Number of landblocks per chunk dimension (matching ACME).</summary>
public const int ChunkSizeInLandblocks = 16;
/// <summary>Max landblock slots per chunk (16x16 = 256).</summary>
public const int SlotsPerChunk = ChunkSizeInLandblocks * ChunkSizeInLandblocks;
/// <summary>Vertices per landblock: 64 cells x 6 verts = 384.</summary>
public const int VerticesPerLandblock = LandblockMesh.VerticesPerLandblock;
/// <summary>Indices per landblock (trivial 0..383, same count as vertices).</summary>
public const int IndicesPerLandblock = VerticesPerLandblock;
/// <summary>Byte size of one TerrainVertex (40 bytes).</summary>
private static readonly int VertexSize = sizeof(TerrainVertex);
/// <summary>Max VBO size per chunk: 256 slots x 384 verts x 40 bytes = ~3.75 MB.</summary>
private static readonly nuint MaxVboBytes =
(nuint)(SlotsPerChunk * VerticesPerLandblock * VertexSize);
/// <summary>Max EBO size per chunk: 256 slots x 384 indices x 4 bytes = ~393 KB.</summary>
private static readonly nuint MaxEboBytes =
(nuint)(SlotsPerChunk * IndicesPerLandblock * sizeof(uint));
// -------------------------------------------------------------------------
// Fields
// -------------------------------------------------------------------------
private readonly GL _gl;
private readonly Shader _shader;
private readonly TerrainAtlas _atlas;
/// <summary>Active chunks keyed by (chunkX, chunkY) packed into a ulong.</summary>
private readonly Dictionary<ulong, ChunkData> _chunks = new();
/// <summary>Reverse map: landblockId -> chunkId, for fast RemoveLandblock.</summary>
private readonly Dictionary<uint, ulong> _landblockToChunk = new();
// -------------------------------------------------------------------------
// Construction
// -------------------------------------------------------------------------
public TerrainChunkRenderer(GL gl, Shader shader, TerrainAtlas atlas)
{
_gl = gl;
_shader = shader;
_atlas = atlas;
}
// -------------------------------------------------------------------------
// Public API
// -------------------------------------------------------------------------
/// <summary>
/// Add (or replace) a landblock's terrain mesh. Vertices are baked to world
/// space using <paramref name="worldOrigin"/>, then uploaded to the correct
/// chunk buffer slot via glBufferSubData.
/// </summary>
public void AddLandblock(uint landblockId, LandblockMeshData meshData, Vector3 worldOrigin)
{
// If this landblock already exists, remove it first.
if (_landblockToChunk.ContainsKey(landblockId))
RemoveLandblock(landblockId);
// Determine chunk coordinates and slot index.
// Landblock ID format: 0xXXYYnnnn (X at bits 24-31, Y at bits 16-23).
int lbX = (int)(landblockId >> 24) & 0xFF;
int lbY = (int)(landblockId >> 16) & 0xFF;
int chunkX = lbX / ChunkSizeInLandblocks;
int chunkY = lbY / ChunkSizeInLandblocks;
ulong chunkId = PackChunkId(chunkX, chunkY);
int localX = lbX % ChunkSizeInLandblocks;
int localY = lbY % ChunkSizeInLandblocks;
int slotIndex = localX * ChunkSizeInLandblocks + localY;
// Create chunk on demand.
if (!_chunks.TryGetValue(chunkId, out var chunk))
{
chunk = CreateChunk(chunkX, chunkY);
_chunks[chunkId] = chunk;
}
// Bake world-space vertices.
var worldVerts = new TerrainVertex[meshData.Vertices.Length];
float zMin = float.MaxValue, zMax = float.MinValue;
for (int i = 0; i < meshData.Vertices.Length; i++)
{
var v = meshData.Vertices[i];
var worldPos = v.Position + worldOrigin;
worldVerts[i] = new TerrainVertex(worldPos, v.Normal, v.Data0, v.Data1, v.Data2, v.Data3);
if (worldPos.Z < zMin) zMin = worldPos.Z;
if (worldPos.Z > zMax) zMax = worldPos.Z;
}
if (zMin == float.MaxValue) { zMin = 0f; zMax = 0f; }
// Upload vertices into the slot's region of the VBO.
nint vboOffset = (nint)(slotIndex * VerticesPerLandblock * VertexSize);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, chunk.Vbo);
fixed (void* p = worldVerts)
{
_gl.BufferSubData(BufferTargetARB.ArrayBuffer, vboOffset,
(nuint)(worldVerts.Length * VertexSize), p);
}
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);
// Track the slot.
chunk.Slots[slotIndex] = new LandblockSlot
{
LandblockId = landblockId,
WorldOrigin = worldOrigin,
MinZ = zMin,
MaxZ = zMax,
};
chunk.Occupied.Add(slotIndex);
_landblockToChunk[landblockId] = chunkId;
// Rebuild the EBO for this chunk (only includes occupied slots).
RebuildChunkEbo(chunk);
// Update chunk AABB.
UpdateChunkBounds(chunk);
}
/// <summary>
/// Remove a landblock from its chunk. If the chunk becomes empty, dispose it.
/// </summary>
public void RemoveLandblock(uint landblockId)
{
if (!_landblockToChunk.TryGetValue(landblockId, out var chunkId))
return;
_landblockToChunk.Remove(landblockId);
if (!_chunks.TryGetValue(chunkId, out var chunk))
return;
// Find which slot this landblock occupies.
int slotIndex = -1;
foreach (var s in chunk.Occupied)
{
if (chunk.Slots[s].LandblockId == landblockId)
{
slotIndex = s;
break;
}
}
if (slotIndex < 0)
return;
// Zero out the VBO region for this slot (optional but clean).
nint vboOffset = (nint)(slotIndex * VerticesPerLandblock * VertexSize);
nuint vboSize = (nuint)(VerticesPerLandblock * VertexSize);
var zeros = new byte[VerticesPerLandblock * VertexSize];
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, chunk.Vbo);
fixed (void* p = zeros)
{
_gl.BufferSubData(BufferTargetARB.ArrayBuffer, vboOffset, vboSize, p);
}
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);
chunk.Slots[slotIndex] = default;
chunk.Occupied.Remove(slotIndex);
if (chunk.Occupied.Count == 0)
{
// Chunk is empty -- dispose GPU resources.
chunk.Dispose(_gl);
_chunks.Remove(chunkId);
}
else
{
RebuildChunkEbo(chunk);
UpdateChunkBounds(chunk);
}
}
/// <summary>
/// Draw all visible terrain chunks. One glDrawElements per non-empty chunk.
/// Frustum culling is performed at the chunk AABB level.
/// </summary>
public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null)
{
if (_chunks.Count == 0)
return;
// Determine which chunk the never-cull landblock lives in.
ulong? neverCullChunkId = null;
if (neverCullLandblockId is not null && _landblockToChunk.TryGetValue(neverCullLandblockId.Value, out var ncId))
neverCullChunkId = ncId;
_shader.Use();
_shader.SetMatrix4("uView", camera.View);
_shader.SetMatrix4("uProjection", camera.Projection);
// Phase G: light direction + ambient + fog come from the shared
// SceneLighting UBO (binding=1) uploaded by GameWindow once per
// frame. Terrain bakes per-vertex AdjustPlanes lighting (r13 §7)
// from the UBO's slot-0 sun + uCellAmbient, then the fragment
// stage adds fog + lightning flash. No per-program uniforms here.
// Terrain atlas on unit 0, alpha atlas on unit 1.
_gl.ActiveTexture(TextureUnit.Texture0);
_gl.BindTexture(TextureTarget.Texture2DArray, _atlas.GlTexture);
_gl.ActiveTexture(TextureUnit.Texture1);
_gl.BindTexture(TextureTarget.Texture2DArray, _atlas.GlAlphaTexture);
int terrainLoc = _gl.GetUniformLocation(_shader.Program, "uTerrain");
if (terrainLoc >= 0) _gl.Uniform1(terrainLoc, 0);
int alphaLoc = _gl.GetUniformLocation(_shader.Program, "uAlpha");
if (alphaLoc >= 0) _gl.Uniform1(alphaLoc, 1);
foreach (var (chunkId, chunk) in _chunks)
{
if (chunk.IndexCount == 0)
continue;
// Chunk-level frustum cull.
if (frustum is not null && chunkId != neverCullChunkId)
{
if (!FrustumCuller.IsAabbVisible(frustum.Value, chunk.AabbMin, chunk.AabbMax))
continue;
}
_gl.BindVertexArray(chunk.Vao);
_gl.DrawElements(
PrimitiveType.Triangles,
(uint)chunk.IndexCount,
DrawElementsType.UnsignedInt,
(void*)0);
}
_gl.BindVertexArray(0);
}
public void Dispose()
{
foreach (var chunk in _chunks.Values)
chunk.Dispose(_gl);
_chunks.Clear();
_landblockToChunk.Clear();
}
// -------------------------------------------------------------------------
// Private helpers
// -------------------------------------------------------------------------
private static ulong PackChunkId(int chunkX, int chunkY)
=> ((ulong)(uint)chunkX << 32) | (uint)chunkY;
/// <summary>
/// Allocate a new chunk with max-size VBO and empty EBO, plus a configured VAO.
/// </summary>
private ChunkData CreateChunk(int chunkX, int chunkY)
{
var chunk = new ChunkData
{
ChunkX = chunkX,
ChunkY = chunkY,
Vao = _gl.GenVertexArray(),
Vbo = _gl.GenBuffer(),
Ebo = _gl.GenBuffer(),
};
// Pre-allocate VBO to max size with DynamicDraw.
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, chunk.Vbo);
_gl.BufferData(BufferTargetARB.ArrayBuffer, MaxVboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);
// Pre-allocate EBO (empty initially, will be rebuilt on first AddLandblock).
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, chunk.Ebo);
_gl.BufferData(BufferTargetARB.ElementArrayBuffer, MaxEboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);
// Configure VAO with the same attribute layout as the old TerrainRenderer.
ConfigureVao(chunk);
return chunk;
}
/// <summary>
/// Set up vertex attribute pointers on the chunk's VAO. Identical layout
/// to the old TerrainRenderer.
/// </summary>
private void ConfigureVao(ChunkData chunk)
{
_gl.BindVertexArray(chunk.Vao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, chunk.Vbo);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, chunk.Ebo);
uint stride = (uint)VertexSize;
// location 0: Position (12 bytes)
_gl.EnableVertexAttribArray(0);
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
// location 1: Normal (12 bytes, offset 12)
_gl.EnableVertexAttribArray(1);
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
// location 2..5: Data0..Data3 as uvec4 byte attributes (4 bytes each, offsets 24, 28, 32, 36).
nint dataOffset = 6 * sizeof(float); // 24 bytes
_gl.EnableVertexAttribArray(2);
_gl.VertexAttribIPointer(2, 4, VertexAttribIType.UnsignedByte, stride, (void*)dataOffset);
_gl.EnableVertexAttribArray(3);
_gl.VertexAttribIPointer(3, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 4));
_gl.EnableVertexAttribArray(4);
_gl.VertexAttribIPointer(4, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 8));
_gl.EnableVertexAttribArray(5);
_gl.VertexAttribIPointer(5, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 12));
_gl.BindVertexArray(0);
}
/// <summary>
/// Rebuild the EBO for a chunk, emitting rebased indices only for occupied
/// slots. Each slot's indices are offset by (slotIndex * VerticesPerLandblock)
/// so they point to the correct region of the VBO.
/// </summary>
private void RebuildChunkEbo(ChunkData chunk)
{
int totalIndices = chunk.Occupied.Count * IndicesPerLandblock;
var indices = new uint[totalIndices];
int writePos = 0;
foreach (var slotIndex in chunk.Occupied)
{
uint vertexBase = (uint)(slotIndex * VerticesPerLandblock);
for (uint i = 0; i < IndicesPerLandblock; i++)
indices[writePos++] = vertexBase + i;
}
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, chunk.Ebo);
fixed (void* p = indices)
{
_gl.BufferSubData(BufferTargetARB.ElementArrayBuffer, 0,
(nuint)(totalIndices * sizeof(uint)), p);
}
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);
chunk.IndexCount = totalIndices;
}
/// <summary>
/// Recompute the chunk's world-space AABB from all occupied landblock slots.
/// </summary>
private static void UpdateChunkBounds(ChunkData chunk)
{
float minX = float.MaxValue, minY = float.MaxValue, minZ = float.MaxValue;
float maxX = float.MinValue, maxY = float.MinValue, maxZ = float.MinValue;
foreach (var slotIndex in chunk.Occupied)
{
var slot = chunk.Slots[slotIndex];
float ox = slot.WorldOrigin.X;
float oy = slot.WorldOrigin.Y;
if (ox < minX) minX = ox;
if (oy < minY) minY = oy;
if (slot.MinZ < minZ) minZ = slot.MinZ;
float ex = ox + LandblockMesh.LandblockSize;
float ey = oy + LandblockMesh.LandblockSize;
if (ex > maxX) maxX = ex;
if (ey > maxY) maxY = ey;
if (slot.MaxZ > maxZ) maxZ = slot.MaxZ;
}
if (minX == float.MaxValue)
{
chunk.AabbMin = Vector3.Zero;
chunk.AabbMax = Vector3.Zero;
}
else
{
chunk.AabbMin = new Vector3(minX, minY, minZ);
chunk.AabbMax = new Vector3(maxX, maxY, maxZ);
}
}
// -------------------------------------------------------------------------
// Inner types
// -------------------------------------------------------------------------
/// <summary>
/// Per-landblock slot tracking within a chunk's VBO.
/// </summary>
private struct LandblockSlot
{
public uint LandblockId;
public Vector3 WorldOrigin;
public float MinZ;
public float MaxZ;
}
/// <summary>
/// GPU resources and metadata for a single 16x16 terrain chunk.
/// </summary>
private sealed class ChunkData
{
public int ChunkX;
public int ChunkY;
// GPU handles.
public uint Vao;
public uint Vbo;
public uint Ebo;
/// <summary>Per-slot landblock data. Indexed by (localX * 16 + localY).</summary>
public readonly LandblockSlot[] Slots = new LandblockSlot[SlotsPerChunk];
/// <summary>Set of occupied slot indices within this chunk.</summary>
public readonly HashSet<int> Occupied = new();
/// <summary>Current number of valid indices in the EBO (set by RebuildChunkEbo).</summary>
public int IndexCount;
/// <summary>World-space AABB for chunk-level frustum culling.</summary>
public Vector3 AabbMin;
public Vector3 AabbMax;
public void Dispose(GL gl)
{
gl.DeleteVertexArray(Vao);
gl.DeleteBuffer(Vbo);
gl.DeleteBuffer(Ebo);
}
}
}

View file

@ -0,0 +1,360 @@
using System.Numerics;
using AcDream.App.Rendering.Wb;
using AcDream.Core.Terrain;
using Silk.NET.OpenGL;
namespace AcDream.App.Rendering;
/// <summary>
/// Phase N.5b modern terrain dispatcher. Single global VBO/EBO with a slot
/// allocator (one slot per landblock, 384 verts × 40 bytes = 15,360 bytes
/// per slot). Per-frame: build a DrawElementsIndirectCommand array from
/// visible slots, upload, dispatch via glMultiDrawElementsIndirect. Atlas
/// textures bound via bindless handles set per-frame as sampler uniforms.
///
/// Total ~6-8 GL calls per frame for terrain regardless of visible
/// landblock count.
/// </summary>
public sealed unsafe class TerrainModernRenderer : IDisposable
{
// VertsPerLandblock MUST stay divisible by 6 — terrain_modern.vert uses
// `gl_VertexID % 6` to pick the cell-corner index (BL/BR/TR/TL), and
// because we bake `slot * VertsPerLandblock` into indices CPU-side and
// pass BaseVertex=0 to MultiDrawElementsIndirect, gl_VertexID becomes
// `slot * VertsPerLandblock + local_index`. The shader's modulo-6 only
// reduces to `local_index % 6` because 384 is a multiple of 6. Changing
// either constant without auditing the shader will silently mis-render.
private const int VertsPerLandblock = LandblockMesh.VerticesPerLandblock; // 384 (= 64 cells * 6 verts)
private const int IndicesPerLandblock = VertsPerLandblock;
private const int VertexSize = 40; // sizeof(TerrainVertex)
private const int IndexSize = sizeof(uint);
private const float LandblockSize = LandblockMesh.LandblockSize; // 192
private readonly GL _gl;
private readonly BindlessSupport _bindless;
private readonly Shader _shader;
private readonly TerrainAtlas _atlas;
private readonly TerrainSlotAllocator _alloc;
// Per-slot live data (index by slot integer; null entries are unused slots).
private SlotData?[] _slots;
// Reverse map: landblockId -> slot, for RemoveLandblock and replacement.
private readonly Dictionary<uint, int> _idToSlot = new();
// GPU buffers.
private uint _globalVao;
private uint _globalVbo;
private uint _globalEbo;
private uint _indirectBuffer;
private int _indirectCapacity;
// Cached uvec2-handle uniform locations (matrix uniforms are set by name via Shader.SetMatrix4).
private int _uTerrainHandleLoc;
private int _uAlphaHandleLoc;
// Reusable per-frame buffers.
private readonly List<int> _visibleSlots = new();
private DrawElementsIndirectCommand[] _deicScratch = Array.Empty<DrawElementsIndirectCommand>();
// Diag.
public int LoadedSlots => _alloc.LoadedCount;
public int VisibleSlots => _visibleSlots.Count;
public int CapacitySlots => _alloc.Capacity;
public TerrainModernRenderer(
GL gl,
BindlessSupport bindless,
Shader shader,
TerrainAtlas atlas,
int initialSlotCapacity = 64)
{
_gl = gl;
_bindless = bindless;
_shader = shader;
_atlas = atlas;
_alloc = new TerrainSlotAllocator(initialSlotCapacity);
_slots = new SlotData?[initialSlotCapacity];
_uTerrainHandleLoc = _gl.GetUniformLocation(_shader.Program, "uTerrainHandle");
_uAlphaHandleLoc = _gl.GetUniformLocation(_shader.Program, "uAlphaHandle");
_globalVao = _gl.GenVertexArray();
_globalVbo = _gl.GenBuffer();
_globalEbo = _gl.GenBuffer();
AllocateGpuBuffers(initialSlotCapacity);
ConfigureVao();
_indirectBuffer = _gl.GenBuffer();
}
public void AddLandblock(uint landblockId, LandblockMeshData meshData, Vector3 worldOrigin)
{
ArgumentNullException.ThrowIfNull(meshData);
if (meshData.Vertices.Length != VertsPerLandblock)
throw new ArgumentException(
$"Expected {VertsPerLandblock} vertices, got {meshData.Vertices.Length}",
nameof(meshData));
if (meshData.Indices.Length != IndicesPerLandblock)
throw new ArgumentException(
$"Expected {IndicesPerLandblock} indices, got {meshData.Indices.Length}",
nameof(meshData));
if (_idToSlot.ContainsKey(landblockId))
RemoveLandblock(landblockId);
int slot = _alloc.Allocate(out var needsGrow);
if (needsGrow)
{
int newCap = Math.Max(_alloc.Capacity * 2, slot + 1);
EnsureCapacity(newCap);
}
// Bake worldOrigin into vertex positions; capture min/max Z for AABB.
var bakedVerts = new TerrainVertex[VertsPerLandblock];
float zMin = float.MaxValue, zMax = float.MinValue;
for (int i = 0; i < VertsPerLandblock; i++)
{
var v = meshData.Vertices[i];
var worldPos = v.Position + worldOrigin;
bakedVerts[i] = new TerrainVertex(worldPos, v.Normal, v.Data0, v.Data1, v.Data2, v.Data3);
if (worldPos.Z < zMin) zMin = worldPos.Z;
if (worldPos.Z > zMax) zMax = worldPos.Z;
}
if (zMin == float.MaxValue) { zMin = 0f; zMax = 0f; }
// Bake baseVertex into indices on the CPU side (driver-portable pattern).
uint baseVertex = (uint)(slot * VertsPerLandblock);
var bakedIndices = new uint[IndicesPerLandblock];
for (int i = 0; i < IndicesPerLandblock; i++)
bakedIndices[i] = meshData.Indices[i] + baseVertex;
// glBufferSubData into the slot's VBO + EBO regions.
nint vboByteOffset = (nint)(slot * VertsPerLandblock * VertexSize);
nint eboByteOffset = (nint)(slot * IndicesPerLandblock * IndexSize);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
fixed (TerrainVertex* p = bakedVerts)
{
_gl.BufferSubData(BufferTargetARB.ArrayBuffer, vboByteOffset,
(nuint)(VertsPerLandblock * VertexSize), p);
}
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);
fixed (uint* p = bakedIndices)
{
_gl.BufferSubData(BufferTargetARB.ElementArrayBuffer, eboByteOffset,
(nuint)(IndicesPerLandblock * IndexSize), p);
}
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);
_slots[slot] = new SlotData
{
LandblockId = landblockId,
WorldOrigin = worldOrigin,
FirstIndex = (uint)(slot * IndicesPerLandblock),
IndexCount = IndicesPerLandblock,
AabbMin = new Vector3(worldOrigin.X, worldOrigin.Y, zMin),
AabbMax = new Vector3(worldOrigin.X + LandblockSize, worldOrigin.Y + LandblockSize, zMax),
};
_idToSlot[landblockId] = slot;
}
public void RemoveLandblock(uint landblockId)
{
if (!_idToSlot.TryGetValue(landblockId, out var slot))
return;
_idToSlot.Remove(landblockId);
_slots[slot] = null;
_alloc.Free(slot);
// No GPU clear: the per-frame DEIC array won't reference this slot.
}
public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null)
{
if (_alloc.LoadedCount == 0) return;
// Build visible slot list with per-slot frustum cull.
_visibleSlots.Clear();
for (int slot = 0; slot < _slots.Length; slot++)
{
var data = _slots[slot];
if (data is null) continue;
if (frustum is not null && data.LandblockId != neverCullLandblockId)
{
if (!FrustumCuller.IsAabbVisible(frustum.Value, data.AabbMin, data.AabbMax))
continue;
}
_visibleSlots.Add(slot);
}
if (_visibleSlots.Count == 0) return;
// Build DEIC array.
if (_deicScratch.Length < _visibleSlots.Count)
_deicScratch = new DrawElementsIndirectCommand[Math.Max(_visibleSlots.Count, 64)];
for (int i = 0; i < _visibleSlots.Count; i++)
{
var data = _slots[_visibleSlots[i]]!;
_deicScratch[i] = new DrawElementsIndirectCommand
{
Count = (uint)data.IndexCount,
InstanceCount = 1u,
FirstIndex = data.FirstIndex,
BaseVertex = 0, // baked into indices on upload
BaseInstance = 0,
};
}
// Grow indirect buffer if needed.
if (_visibleSlots.Count > _indirectCapacity)
{
_indirectCapacity = Math.Max(64, _visibleSlots.Count * 2);
_gl.BindBuffer(GLEnum.DrawIndirectBuffer, _indirectBuffer);
_gl.BufferData(GLEnum.DrawIndirectBuffer,
(nuint)(_indirectCapacity * sizeof(DrawElementsIndirectCommand)),
null, GLEnum.DynamicDraw);
}
else
{
_gl.BindBuffer(GLEnum.DrawIndirectBuffer, _indirectBuffer);
}
// Upload DEIC array.
fixed (DrawElementsIndirectCommand* p = _deicScratch)
{
_gl.BufferSubData(GLEnum.DrawIndirectBuffer, 0,
(nuint)(_visibleSlots.Count * sizeof(DrawElementsIndirectCommand)), p);
}
// Bind shader + uniforms + atlas handles.
_shader.Use();
_shader.SetMatrix4("uView", camera.View);
_shader.SetMatrix4("uProjection", camera.Projection);
var (terrainHandle, alphaHandle) = _atlas.GetBindlessHandles();
// Pass each 64-bit handle as a uvec2 (low 32 bits, high 32 bits).
// GLSL constructs sampler2DArray(uTerrainHandle) at the use site —
// see terrain_modern.frag for why this is the safe pattern.
_gl.ProgramUniform2(_shader.Program, _uTerrainHandleLoc,
(uint)(terrainHandle & 0xFFFFFFFFu), (uint)(terrainHandle >> 32));
_gl.ProgramUniform2(_shader.Program, _uAlphaHandleLoc,
(uint)(alphaHandle & 0xFFFFFFFFu), (uint)(alphaHandle >> 32));
_gl.BindVertexArray(_globalVao);
_gl.MemoryBarrier(MemoryBarrierMask.CommandBarrierBit);
_gl.MultiDrawElementsIndirect(
PrimitiveType.Triangles, DrawElementsType.UnsignedInt,
(void*)0,
(uint)_visibleSlots.Count,
(uint)sizeof(DrawElementsIndirectCommand));
_gl.BindVertexArray(0);
_gl.BindBuffer(GLEnum.DrawIndirectBuffer, 0);
}
public void Dispose()
{
_gl.DeleteVertexArray(_globalVao);
_gl.DeleteBuffer(_globalVbo);
_gl.DeleteBuffer(_globalEbo);
_gl.DeleteBuffer(_indirectBuffer);
}
// ----------------------------------------------------------------
// Private helpers
// ----------------------------------------------------------------
private void AllocateGpuBuffers(int capacitySlots)
{
nuint vboBytes = (nuint)(capacitySlots * VertsPerLandblock * VertexSize);
nuint eboBytes = (nuint)(capacitySlots * IndicesPerLandblock * IndexSize);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
_gl.BufferData(BufferTargetARB.ArrayBuffer, vboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);
_gl.BufferData(BufferTargetARB.ElementArrayBuffer, eboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);
}
private void ConfigureVao()
{
_gl.BindVertexArray(_globalVao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);
uint stride = (uint)VertexSize;
// location 0: Position
_gl.EnableVertexAttribArray(0);
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
// location 1: Normal
_gl.EnableVertexAttribArray(1);
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
// locations 2-5: Data0..Data3 (uvec4 byte attributes)
nint dataOffset = 6 * sizeof(float);
_gl.EnableVertexAttribArray(2);
_gl.VertexAttribIPointer(2, 4, VertexAttribIType.UnsignedByte, stride, (void*)dataOffset);
_gl.EnableVertexAttribArray(3);
_gl.VertexAttribIPointer(3, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 4));
_gl.EnableVertexAttribArray(4);
_gl.VertexAttribIPointer(4, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 8));
_gl.EnableVertexAttribArray(5);
_gl.VertexAttribIPointer(5, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 12));
_gl.BindVertexArray(0);
}
private void EnsureCapacity(int newCapacity)
{
if (newCapacity <= _alloc.Capacity) return;
// Allocate new VBO + EBO at new size; copy old contents; swap; recreate VAO.
uint newVbo = _gl.GenBuffer();
uint newEbo = _gl.GenBuffer();
nuint newVboBytes = (nuint)(newCapacity * VertsPerLandblock * VertexSize);
nuint newEboBytes = (nuint)(newCapacity * IndicesPerLandblock * IndexSize);
nuint oldVboBytes = (nuint)(_alloc.Capacity * VertsPerLandblock * VertexSize);
nuint oldEboBytes = (nuint)(_alloc.Capacity * IndicesPerLandblock * IndexSize);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, newVbo);
_gl.BufferData(BufferTargetARB.ArrayBuffer, newVboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.CopyReadBuffer, _globalVbo);
_gl.BindBuffer(BufferTargetARB.CopyWriteBuffer, newVbo);
_gl.CopyBufferSubData(CopyBufferSubDataTarget.CopyReadBuffer, CopyBufferSubDataTarget.CopyWriteBuffer,
0, 0, oldVboBytes);
_gl.DeleteBuffer(_globalVbo);
_globalVbo = newVbo;
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, newEbo);
_gl.BufferData(BufferTargetARB.ElementArrayBuffer, newEboBytes, null, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.CopyReadBuffer, _globalEbo);
_gl.BindBuffer(BufferTargetARB.CopyWriteBuffer, newEbo);
_gl.CopyBufferSubData(CopyBufferSubDataTarget.CopyReadBuffer, CopyBufferSubDataTarget.CopyWriteBuffer,
0, 0, oldEboBytes);
_gl.DeleteBuffer(_globalEbo);
_globalEbo = newEbo;
// Recreate VAO with new buffer bindings.
_gl.DeleteVertexArray(_globalVao);
_globalVao = _gl.GenVertexArray();
ConfigureVao();
// Grow slot tracking array.
Array.Resize(ref _slots, newCapacity);
_alloc.GrowTo(newCapacity);
}
private sealed class SlotData
{
public uint LandblockId;
public Vector3 WorldOrigin;
public uint FirstIndex;
public int IndexCount;
public Vector3 AabbMin;
public Vector3 AabbMax;
}
}

View file

@ -1,247 +0,0 @@
using System.Numerics;
using AcDream.Core.Terrain;
using Silk.NET.OpenGL;
namespace AcDream.App.Rendering;
/// <summary>
/// Draws the Phase 3c per-cell terrain mesh. All loaded landblocks share a
/// single VBO + EBO + VAO. Vertex positions are baked in world space so no
/// uModel uniform is needed. The VAO is bound once per frame; each visible
/// landblock gets one glDrawElements call into its sub-range of the shared EBO.
///
/// Attribute layout (see TerrainVertex for the byte layout):
/// location 0: vec3 aPos (3 floats, world space)
/// location 1: vec3 aNormal (3 floats)
/// location 2: uvec4 aPacked0 (4 bytes, Data0)
/// location 3: uvec4 aPacked1 (4 bytes, Data1)
/// location 4: uvec4 aPacked2 (4 bytes, Data2)
/// location 5: uvec4 aPacked3 (4 bytes, Data3)
/// </summary>
public sealed unsafe class TerrainRenderer : IDisposable
{
private readonly GL _gl;
private readonly Shader _shader;
private readonly TerrainAtlas _atlas;
// Logical per-landblock data (CPU side).
private readonly Dictionary<uint, LandblockEntry> _entries = new();
// Shared GPU buffers — rebuilt whenever a landblock is added or removed.
private uint _vao;
private uint _vbo;
private uint _ebo;
private bool _gpuDirty = true; // true = buffers need rebuilding before next Draw
public TerrainRenderer(GL gl, Shader shader, TerrainAtlas atlas)
{
_gl = gl;
_shader = shader;
_atlas = atlas;
_vao = _gl.GenVertexArray();
_vbo = _gl.GenBuffer();
_ebo = _gl.GenBuffer();
ConfigureVao();
}
public void AddLandblock(uint landblockId, LandblockMeshData meshData, Vector3 worldOrigin)
{
if (_entries.ContainsKey(landblockId))
_entries.Remove(landblockId);
// Bake world-space positions: offset every vertex by worldOrigin.
var worldVerts = new TerrainVertex[meshData.Vertices.Length];
float zMin = float.MaxValue, zMax = float.MinValue;
for (int i = 0; i < meshData.Vertices.Length; i++)
{
var v = meshData.Vertices[i];
var worldPos = v.Position + worldOrigin;
worldVerts[i] = new TerrainVertex(worldPos, v.Normal, v.Data0, v.Data1, v.Data2, v.Data3);
if (worldPos.Z < zMin) zMin = worldPos.Z;
if (worldPos.Z > zMax) zMax = worldPos.Z;
}
if (zMin == float.MaxValue) { zMin = 0f; zMax = 0f; }
_entries[landblockId] = new LandblockEntry
{
LandblockId = landblockId,
WorldOrigin = worldOrigin,
Vertices = worldVerts,
Indices = meshData.Indices, // local 0..N-1; will be rebased on rebuild
MinZ = zMin,
MaxZ = zMax,
};
_gpuDirty = true;
}
public void RemoveLandblock(uint landblockId)
{
if (_entries.Remove(landblockId))
_gpuDirty = true;
}
public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null)
{
if (_entries.Count == 0)
return;
if (_gpuDirty)
RebuildGpuBuffers();
_shader.Use();
_shader.SetMatrix4("uView", camera.View);
_shader.SetMatrix4("uProjection", camera.Projection);
// Terrain atlas on unit 0, alpha atlas on unit 1.
_gl.ActiveTexture(TextureUnit.Texture0);
_gl.BindTexture(TextureTarget.Texture2DArray, _atlas.GlTexture);
_gl.ActiveTexture(TextureUnit.Texture1);
_gl.BindTexture(TextureTarget.Texture2DArray, _atlas.GlAlphaTexture);
int terrainLoc = _gl.GetUniformLocation(_shader.Program, "uTerrain");
if (terrainLoc >= 0) _gl.Uniform1(terrainLoc, 0);
int alphaLoc = _gl.GetUniformLocation(_shader.Program, "uAlpha");
if (alphaLoc >= 0) _gl.Uniform1(alphaLoc, 1);
// Bind the shared VAO once for the entire frame.
_gl.BindVertexArray(_vao);
foreach (var entry in _entries.Values)
{
// Per-landblock frustum cull using world-space AABB.
if (frustum is not null && entry.LandblockId != neverCullLandblockId)
{
var aabbMin = new Vector3(entry.WorldOrigin.X, entry.WorldOrigin.Y, entry.MinZ);
var aabbMax = new Vector3(entry.WorldOrigin.X + 192f, entry.WorldOrigin.Y + 192f, entry.MaxZ);
if (!FrustumCuller.IsAabbVisible(frustum.Value, aabbMin, aabbMax))
continue;
}
// Draw only this landblock's sub-range in the shared EBO.
// EboOffset is in bytes (uint = 4 bytes).
_gl.DrawElements(
PrimitiveType.Triangles,
(uint)entry.IndexCount,
DrawElementsType.UnsignedInt,
(void*)(entry.EboByteOffset));
}
_gl.BindVertexArray(0);
}
public void Dispose()
{
_gl.DeleteVertexArray(_vao);
_gl.DeleteBuffer(_vbo);
_gl.DeleteBuffer(_ebo);
_entries.Clear();
}
// -------------------------------------------------------------------------
// Private helpers
// -------------------------------------------------------------------------
private void ConfigureVao()
{
_gl.BindVertexArray(_vao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _vbo);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _ebo);
uint stride = (uint)sizeof(TerrainVertex);
// location 0: Position (12 bytes)
_gl.EnableVertexAttribArray(0);
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
// location 1: Normal (12 bytes, offset 12)
_gl.EnableVertexAttribArray(1);
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
// location 2..5: Data0..Data3 as uvec4 byte attributes (4 bytes each,
// offsets 24, 28, 32, 36).
nint dataOffset = 6 * sizeof(float); // 24 bytes
_gl.EnableVertexAttribArray(2);
_gl.VertexAttribIPointer(2, 4, VertexAttribIType.UnsignedByte, stride, (void*)dataOffset);
_gl.EnableVertexAttribArray(3);
_gl.VertexAttribIPointer(3, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 4));
_gl.EnableVertexAttribArray(4);
_gl.VertexAttribIPointer(4, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 8));
_gl.EnableVertexAttribArray(5);
_gl.VertexAttribIPointer(5, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 12));
_gl.BindVertexArray(0);
}
/// <summary>
/// Concatenate all loaded landblocks into a single VBO + EBO and upload.
/// Called on the cold path (landblock load / unload), not per frame.
/// </summary>
private void RebuildGpuBuffers()
{
// Measure totals.
int totalVerts = 0;
int totalIndices = 0;
foreach (var e in _entries.Values)
{
totalVerts += e.Vertices.Length;
totalIndices += e.Indices.Length;
}
var allVerts = new TerrainVertex[totalVerts];
var allIndices = new uint[totalIndices];
int vertBase = 0;
int indexBase = 0;
foreach (var entry in _entries.Values)
{
// Copy world-space vertices.
entry.Vertices.CopyTo(allVerts, vertBase);
// Rebase local indices (0..N-1) → absolute (vertBase..vertBase+N-1).
for (int i = 0; i < entry.Indices.Length; i++)
allIndices[indexBase + i] = (uint)(vertBase + entry.Indices[i]);
// Record where this landblock's indices live in the EBO (byte offset).
entry.EboByteOffset = (nint)(indexBase * sizeof(uint));
entry.IndexCount = entry.Indices.Length;
vertBase += entry.Vertices.Length;
indexBase += entry.Indices.Length;
}
// Upload to GPU.
_gl.BindVertexArray(_vao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _vbo);
fixed (void* p = allVerts)
_gl.BufferData(BufferTargetARB.ArrayBuffer,
(nuint)(totalVerts * sizeof(TerrainVertex)), p, BufferUsageARB.DynamicDraw);
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _ebo);
fixed (void* p = allIndices)
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
(nuint)(totalIndices * sizeof(uint)), p, BufferUsageARB.DynamicDraw);
_gl.BindVertexArray(0);
_gpuDirty = false;
}
// -------------------------------------------------------------------------
// Data types
// -------------------------------------------------------------------------
private sealed class LandblockEntry
{
public uint LandblockId;
public Vector3 WorldOrigin;
public TerrainVertex[] Vertices = Array.Empty<TerrainVertex>();
public uint[] Indices = Array.Empty<uint>();
public float MinZ;
public float MaxZ;
// Set by RebuildGpuBuffers:
public nint EboByteOffset;
public int IndexCount;
}
}

View file

@ -45,6 +45,15 @@ public sealed class BindlessSupport
_ext.MakeTextureHandleNonResident(handle);
}
// Phase N.5b note: a `SetSamplerHandleUniform` wrapper was added in T6
// and removed when terrain rendering surfaced GL_INVALID_OPERATION on
// NVIDIA Windows for the `uniform sampler2DArray` + glProgramUniformHandleARB
// combination. The replacement pattern (uvec2 handle uniform + GLSL
// sampler-from-handle constructor — see terrain_modern.frag) lives at the
// call site via plain `_gl.ProgramUniform2(program, loc, low, high)`. If
// you re-introduce a sampler-handle helper, restrict it to drivers known
// to accept the direct sampler-uniform path.
/// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
/// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
/// from this extension.</summary>

View file

@ -0,0 +1,76 @@
using System;
using System.Collections.Generic;
namespace AcDream.Core.Terrain;
/// <summary>
/// Pure-CPU slot allocator for the terrain modern dispatcher's global VBO/EBO.
/// One slot = one landblock's worth of mesh data (384 verts + 384 indices).
/// Uses a FIFO free-list for slot recycling and a monotonic counter for
/// first-time growth, mirroring WorldBuilder's TerrainRenderManager pattern.
/// All bookkeeping is CPU-side; the GPU buffer growth itself is performed
/// by TerrainModernRenderer when <see cref="Allocate"/> sets needsGrow=true.
/// </summary>
public sealed class TerrainSlotAllocator
{
private readonly Queue<int> _freeSlots = new();
private readonly HashSet<int> _liveSlots = new();
private int _nextFreeSlot;
private int _capacity;
public TerrainSlotAllocator(int initialCapacity = 64)
{
if (initialCapacity <= 0)
throw new ArgumentOutOfRangeException(nameof(initialCapacity), "must be > 0");
_capacity = initialCapacity;
}
/// <summary>Current capacity in slots. Growable via <see cref="GrowTo"/>.</summary>
public int Capacity => _capacity;
/// <summary>Slots currently in use (allocated minus freed).</summary>
public int LoadedCount => _liveSlots.Count;
/// <summary>
/// Allocate a slot index. Reuses a freed slot via FIFO if available,
/// otherwise hands out the next monotonic index. Sets
/// <paramref name="needsGrow"/> to true when the returned slot index is
/// at or beyond current capacity — caller must <see cref="GrowTo"/>
/// before using the slot.
/// </summary>
public int Allocate(out bool needsGrow)
{
int slot;
if (_freeSlots.TryDequeue(out var freed))
{
slot = freed;
}
else
{
slot = _nextFreeSlot++;
}
_liveSlots.Add(slot);
needsGrow = slot >= _capacity;
return slot;
}
/// <summary>
/// Return a slot to the free list. Throws if the slot wasn't currently
/// allocated (catches double-free bugs).
/// </summary>
public void Free(int slot)
{
if (!_liveSlots.Remove(slot))
throw new InvalidOperationException(
$"Slot {slot} was not allocated (double-free or unknown slot).");
_freeSlots.Enqueue(slot);
}
/// <summary>Update capacity counter after the caller has grown the GPU buffers.</summary>
public void GrowTo(int newCapacity)
{
if (newCapacity < _capacity)
throw new ArgumentException("Capacity can only grow", nameof(newCapacity));
_capacity = newCapacity;
}
}

View file

@ -0,0 +1,168 @@
using AcDream.Core.Terrain;
using Xunit;
using Xunit.Abstractions;
using WbTerrainUtils = WorldBuilder.Shared.Modules.Landscape.Lib.TerrainUtils;
using WbCellSplitDirection = WorldBuilder.Shared.Modules.Landscape.Models.CellSplitDirection;
namespace AcDream.Core.Tests.Terrain;
/// <summary>
/// Phase N.5b data-collection test: quantifies how often WB's
/// <c>TerrainUtils.CalculateSplitDirection</c> disagrees with acdream's
/// <c>TerrainBlending.CalculateSplitDirection</c> (which retail uses
/// per <c>CLandBlockStruct::ConstructPolygons</c> at retail address
/// <c>00531d10</c>; named-retail decomp lines 316042-316144 contain
/// the exact constants <c>0x0CCAC033 / 0x6C1AC587 / 0x421BE3BD /
/// 0x519B8F25</c>).
///
/// Sweeps every (lbX, lbY, cellX, cellY) tuple in the world map
/// (255 x 255 landblocks x 64 cells = ~4.16M cells) and reports the
/// disagreement rate, per-landblock worst case, and a few named
/// representative landblocks. The number drives the Path A/B/C
/// decision in the N.5b brainstorm:
/// - Low disagreement &lt;5% : Path A's risk is bounded
/// - Medium 5-20% : Path B (fork-patch WB) preferred
/// - High &gt;20% : Path B/C strongly preferred
/// </summary>
public class SplitFormulaDivergenceTest
{
private readonly ITestOutputHelper _out;
public SplitFormulaDivergenceTest(ITestOutputHelper output) => _out = output;
[Fact]
public void Quantify_RetailVsWb_DivergenceRate()
{
// Two divergence flavors are tracked simultaneously:
//
// rawDisagree : retail-enum != wb-enum (pure formula output)
// diagonalDisagree: retail-actually-paints-diagonal !=
// wb-actually-paints-diagonal (effective geometry)
//
// The two differ because the enums are SEMANTICALLY INVERTED:
// - acdream `CellSplitDirection.SWtoNE` -> renderer paints BL->TR
// (SW-NE diagonal). Matches retail per AC2D Landblocks.cpp:400-412
// where FSplitNESW=true wraps a TRIANGLE_FAN [BL, BR, TR, TL] =
// diagonal BL-TR.
// - WB `CellSplitDirection.SWtoNE` -> WB's TerrainGeometryGenerator
// emits triangles {BL,TL,BR}+{BR,TL,TR} which share the BR-TL
// diagonal (SE-NW direction). The enum name is misleading; what
// WB actually draws is the OTHER diagonal.
//
// So the question "would WB's pipeline produce the same diagonals as
// retail's pipeline?" is answered by `diagonalDisagree`, not
// `rawDisagree`. If diagonalDisagree is near 0%, WB's formula +
// renderer happen to compose into a correct pipeline (despite the
// confusing labels). If diagonalDisagree is ~50%, the two pipelines
// truly diverge and Path A would visibly break terrain on every
// landblock.
const int lbCount = 255;
const int cellsPerSide = 8;
long totalCells = 0;
long rawDisagree = 0;
long diagonalDisagree = 0;
int worstLbDiag = 0;
uint worstLbX = 0, worstLbY = 0;
int bestLbDiag = 64;
uint bestLbX = 0, bestLbY = 0;
for (uint lbX = 0; lbX < lbCount; lbX++)
for (uint lbY = 0; lbY < lbCount; lbY++)
{
int lbDiagDisagree = 0;
for (uint cx = 0; cx < cellsPerSide; cx++)
for (uint cy = 0; cy < cellsPerSide; cy++)
{
bool retailEnumSWtoNE =
TerrainBlending.CalculateSplitDirection(lbX, cx, lbY, cy)
== CellSplitDirection.SWtoNE;
bool wbEnumSWtoNE =
WbTerrainUtils.CalculateSplitDirection(lbX, cx, lbY, cy)
== WbCellSplitDirection.SWtoNE;
// What diagonal each pipeline actually paints.
bool retailPaintsBLtoTR = retailEnumSWtoNE; // direct mapping
bool wbPaintsBLtoTR = !wbEnumSWtoNE; // inverted mapping
totalCells++;
if (retailEnumSWtoNE != wbEnumSWtoNE) rawDisagree++;
if (retailPaintsBLtoTR != wbPaintsBLtoTR)
{
diagonalDisagree++;
lbDiagDisagree++;
}
}
if (lbDiagDisagree > worstLbDiag)
{
worstLbDiag = lbDiagDisagree;
worstLbX = lbX;
worstLbY = lbY;
}
if (lbDiagDisagree < bestLbDiag)
{
bestLbDiag = lbDiagDisagree;
bestLbX = lbX;
bestLbY = lbY;
}
}
double rawPct = 100.0 * rawDisagree / totalCells;
double diagPct = 100.0 * diagonalDisagree / totalCells;
_out.WriteLine($"=== Phase N.5b — terrain split formula divergence ===");
_out.WriteLine($"Sweep: {lbCount}x{lbCount} landblocks, {cellsPerSide*cellsPerSide} cells each");
_out.WriteLine($"Total cells: {totalCells:N0}");
_out.WriteLine("");
_out.WriteLine($"RAW enum-output disagreement : {rawDisagree,12:N0} ({rawPct:F2}%)");
_out.WriteLine($" (compares retail-enum vs wb-enum, NOT what each system actually draws)");
_out.WriteLine("");
_out.WriteLine($"DIAGONAL-actually-painted disagreement: {diagonalDisagree,12:N0} ({diagPct:F2}%)");
_out.WriteLine($" (compares retail-paints-BL->TR vs wb-paints-BL->TR; this is the");
_out.WriteLine($" number that determines whether Path A visibly works)");
_out.WriteLine("");
_out.WriteLine($"Worst landblock (diagonal): 0x{worstLbX:X2}{worstLbY:X2} disagrees on {worstLbDiag}/64 cells ({100.0*worstLbDiag/64:F1}%)");
_out.WriteLine($"Best landblock (diagonal): 0x{bestLbX:X2}{bestLbY:X2} disagrees on {bestLbDiag}/64 cells ({100.0*bestLbDiag/64:F1}%)");
// Specific landblocks of interest (per N.5b handoff representative set).
var representative = new (string name, uint lbX, uint lbY)[]
{
("Holtburg town", 0xA9, 0xB0),
("Holtburg LB 0xA9B1", 0xA9, 0xB1),
("Foundry-area", 0x80, 0x80),
("Cragstone", 0xCB, 0x99),
("Direlands sample", 0xC0, 0x40),
("MapOrigin 0x0000", 0x00, 0x00),
("MapCorner 0xFEFE", 0xFE, 0xFE),
("Mid-map 0x7F7F", 0x7F, 0x7F),
("Subway dungeon LB 0x0185 outdoor part", 0x01, 0x85),
};
_out.WriteLine("");
_out.WriteLine("Representative landblocks (diagonal-actually-painted disagreement):");
foreach (var (name, lbX, lbY) in representative)
{
int dis = 0;
for (uint cx = 0; cx < 8; cx++)
for (uint cy = 0; cy < 8; cy++)
{
bool retailEnum = TerrainBlending.CalculateSplitDirection(lbX, cx, lbY, cy) == CellSplitDirection.SWtoNE;
bool wbEnum = WbTerrainUtils.CalculateSplitDirection(lbX, cx, lbY, cy) == WbCellSplitDirection.SWtoNE;
bool retailPaintsBLtoTR = retailEnum;
bool wbPaintsBLtoTR = !wbEnum;
if (retailPaintsBLtoTR != wbPaintsBLtoTR) dis++;
}
_out.WriteLine($" 0x{lbX:X2}{lbY:X2} {dis,2}/64 cells disagree ({100.0*dis/64:F1}%) {name}");
}
// Soft-floor on the DIAGONAL comparison: if diagPct is near 0% the
// formulas are equivalent post-inversion (Path A would just work
// visually; the only "bug" is enum naming). If diagPct is well
// above 0%, Path A truly breaks terrain.
// Soft-ceiling: an inversion of inversion shouldn't push past ~70%.
Assert.True(diagPct >= 0 && diagPct <= 100,
$"Sanity: diagonal disagreement out of range (rate={diagPct:F2}%)");
}
}

View file

@ -0,0 +1,189 @@
using System.Collections.Generic;
using System.IO;
using AcDream.Core.Physics;
using AcDream.Core.Terrain;
using DatReaderWriter;
using DatReaderWriter.DBObjs;
using DatReaderWriter.Options;
using Xunit;
using Xunit.Abstractions;
using Env = System.Environment;
namespace AcDream.Core.Tests.Terrain;
/// <summary>
/// Phase N.5b Z-conformance sentinel: proves that the visual terrain mesh
/// produced by <see cref="LandblockMesh.Build"/> agrees with the physics-side
/// <see cref="TerrainSurface.SampleZFromHeightmap"/> at arbitrary (X, Y)
/// within 1 mm. This is the exact bug class issue #51 names — if a future
/// refactor silently changes formula or vertex layout in either path,
/// this test fires before the player floats above (or sinks below) the
/// visible ground.
///
/// The test is dat-data-dependent. If <c>ACDREAM_DAT_DIR</c> isn't set or
/// the directory doesn't exist, the test logs a SKIP and passes — keeps CI
/// (no dat data) green while still firing locally on every developer run.
/// </summary>
public class TerrainModernConformanceTests
{
private readonly ITestOutputHelper _out;
public TerrainModernConformanceTests(ITestOutputHelper output) => _out = output;
private static readonly (string name, uint lbX, uint lbY)[] RepresentativeLandblocks =
{
("Holtburg flat 0xA9B0", 0xA9, 0xB0),
("Holtburg sloped 0xA9B1", 0xA9, 0xB1),
("Foundry-area 0x8080", 0x80, 0x80),
("Cragstone 0xCB99", 0xCB, 0x99),
("Direlands sample 0xC040", 0xC0, 0x40),
("MapOrigin 0x0000", 0x00, 0x00),
("Mid-map 0x7F7F", 0x7F, 0x7F),
("MapCorner 0xFEFE", 0xFE, 0xFE),
("Subway outdoor 0x0185", 0x01, 0x85),
("North continent 0x4D96", 0x4D, 0x96), // worst-case landblock from SplitFormulaDivergenceTest
};
[Fact]
public void VisualMeshZ_AgreesWith_PhysicsZ_WithinOneMillimeter()
{
var datDir = Env.GetEnvironmentVariable("ACDREAM_DAT_DIR")
?? Path.Combine(Env.GetFolderPath(Env.SpecialFolder.UserProfile),
"Documents", "Asheron's Call");
if (!Directory.Exists(datDir))
{
_out.WriteLine($"SKIP: dat directory not found at {datDir}");
return;
}
using var dats = new DatCollection(datDir, DatAccessType.Read);
var region = dats.Get<Region>(0x13000000u);
Assert.NotNull(region);
var heightTable = region.LandDefs.LandHeightTable;
Assert.NotNull(heightTable);
Assert.True(heightTable.Length >= 256, "heightTable must have at least 256 entries");
// Empty blending context — the conformance test only cares about
// vertex Z values, never the surface info / atlas layers. An empty
// dictionary + empty arrays are sufficient for BuildSurface to
// resolve every cell to a "base only" surface (the Z values come
// from the heightmap, not from the surface info).
var ctx = new TerrainBlendingContext(
TerrainTypeToLayer: new Dictionary<uint, byte>(),
RoadLayer: SurfaceInfo.None,
CornerAlphaLayers: Array.Empty<byte>(),
SideAlphaLayers: Array.Empty<byte>(),
RoadAlphaLayers: Array.Empty<byte>(),
CornerAlphaTCodes: Array.Empty<uint>(),
SideAlphaTCodes: Array.Empty<uint>(),
RoadAlphaRCodes: Array.Empty<uint>());
long totalSamples = 0;
long totalLandblocksTested = 0;
double maxDelta = 0;
(string name, uint lbX, uint lbY, float lx, float ly, float meshZ, float physicsZ) worstCase = default;
// Fixed seed for reproducible sample distribution. If a future change
// makes the test fire, the same (lx, ly) sequence reproduces the
// exact failing point on a follow-up run.
var rng = new Random(42);
foreach (var (name, lbX, lbY) in RepresentativeLandblocks)
{
uint landblockId = (lbX << 24) | (lbY << 16) | 0xFFFFu;
var landblock = dats.Get<LandBlock>(landblockId);
if (landblock is null)
{
_out.WriteLine($" skipped {name}: dat not found (probably water-only)");
continue;
}
totalLandblocksTested++;
var surfaceCache = new Dictionary<uint, SurfaceInfo>();
var meshData = LandblockMesh.Build(landblock, lbX, lbY, heightTable, ctx, surfaceCache);
// Sample 100 (localX, localY) points uniformly in [0, 191.975].
// The physics path clamps fx = localX/24 to (CellsPerSide - 0.001f)
// = 7.999, which corresponds to localX <= 7.999 * 24 = 191.976.
// Sampling beyond that boundary makes physics compute Z at the
// clamped position while the mesh sampler uses the actual
// position — a difference of up to 23 mm at the upper edge,
// which on a steep slope would falsely trip the 1 mm sentinel.
// Stay strictly below the clamp boundary so both oracles
// compute Z at the same (cellX, tx).
for (int s = 0; s < 100; s++)
{
float lx = (float)rng.NextDouble() * 191.975f;
float ly = (float)rng.NextDouble() * 191.975f;
float meshZ = SampleMeshZ(meshData, lx, ly);
float physicsZ = TerrainSurface.SampleZFromHeightmap(
landblock.Height, heightTable, lbX, lbY, lx, ly);
double delta = Math.Abs(meshZ - physicsZ);
if (delta > maxDelta)
{
maxDelta = delta;
worstCase = (name, lbX, lbY, lx, ly, meshZ, physicsZ);
}
totalSamples++;
Assert.True(delta < 0.001,
$"Mesh Z disagrees with physics Z at lb=0x{lbX:X2}{lbY:X2} ({name}) " +
$"local=({lx:F2},{ly:F2}): meshZ={meshZ:F4} physicsZ={physicsZ:F4} delta={delta:F4}m");
}
}
_out.WriteLine($"=== Phase N.5b conformance sweep ===");
_out.WriteLine($"Landblocks tested: {totalLandblocksTested}/{RepresentativeLandblocks.Length}");
_out.WriteLine($"Total samples: {totalSamples}");
_out.WriteLine($"Max |delta|: {maxDelta * 1000:F4} mm (tolerance: 1.0 mm)");
if (totalSamples > 0)
_out.WriteLine($"Worst case: {worstCase.name} local=({worstCase.lx:F2},{worstCase.ly:F2}) " +
$"meshZ={worstCase.meshZ:F4} physicsZ={worstCase.physicsZ:F4}");
Assert.True(totalLandblocksTested >= 5,
$"Expected at least 5 representative landblocks loadable; got {totalLandblocksTested}.");
}
/// <summary>
/// Sample the mesh's triangle-interpolated Z at (localX, localY). Walks
/// the mesh's triangles (3 indices each), tests point-in-triangle in 2D,
/// and barycentric-interpolates Z from the matching triangle's three Zs.
///
/// The mesh has 128 triangles per landblock (64 cells × 2). Every (lx, ly)
/// in [0, 192) lies in exactly one triangle (or on a shared edge — the
/// epsilon makes either side acceptable since they agree at the seam).
/// </summary>
private static float SampleMeshZ(LandblockMeshData mesh, float lx, float ly)
{
for (int triBase = 0; triBase < mesh.Indices.Length; triBase += 3)
{
var v0 = mesh.Vertices[mesh.Indices[triBase + 0]];
var v1 = mesh.Vertices[mesh.Indices[triBase + 1]];
var v2 = mesh.Vertices[mesh.Indices[triBase + 2]];
// Barycentric coords for (lx, ly) wrt triangle v0/v1/v2 in 2D.
float denom = (v1.Position.Y - v2.Position.Y) * (v0.Position.X - v2.Position.X)
+ (v2.Position.X - v1.Position.X) * (v0.Position.Y - v2.Position.Y);
if (Math.Abs(denom) < 1e-9f) continue;
float a = ((v1.Position.Y - v2.Position.Y) * (lx - v2.Position.X)
+ (v2.Position.X - v1.Position.X) * (ly - v2.Position.Y)) / denom;
float b = ((v2.Position.Y - v0.Position.Y) * (lx - v2.Position.X)
+ (v0.Position.X - v2.Position.X) * (ly - v2.Position.Y)) / denom;
float c = 1f - a - b;
// Inside test with epsilon for boundary stability — points that
// land exactly on a shared edge between two triangles still
// resolve, picking whichever the loop hits first (Z agrees on
// the seam either way).
const float eps = 1e-4f;
if (a >= -eps && b >= -eps && c >= -eps)
return a * v0.Position.Z + b * v1.Position.Z + c * v2.Position.Z;
}
// Should not happen for valid mesh + in-bounds (lx, ly).
throw new InvalidOperationException(
$"No triangle found containing local=({lx:F2},{ly:F2}); mesh has {mesh.Indices.Length / 3} triangles.");
}
}

View file

@ -0,0 +1,88 @@
using AcDream.Core.Terrain;
using Xunit;
namespace AcDream.Core.Tests.Terrain;
public class TerrainSlotAllocatorTests
{
[Fact]
public void Allocate_FromFreshAllocator_ReturnsZero()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
Assert.Equal(0, alloc.Allocate(out _));
}
[Fact]
public void Allocate_TwoTimes_ReturnsZeroThenOne()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
Assert.Equal(0, alloc.Allocate(out _));
Assert.Equal(1, alloc.Allocate(out _));
}
[Fact]
public void FreeThenAllocate_ReusesFreedSlot()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
var s0 = alloc.Allocate(out _);
var s1 = alloc.Allocate(out _);
alloc.Free(s0);
Assert.Equal(s0, alloc.Allocate(out _));
}
[Fact]
public void FreeOrderedFreshAllocs_ReturnsInFifoOrder()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
var s0 = alloc.Allocate(out _);
var s1 = alloc.Allocate(out _);
var s2 = alloc.Allocate(out _);
alloc.Free(s0);
alloc.Free(s2);
Assert.Equal(s0, alloc.Allocate(out _));
Assert.Equal(s2, alloc.Allocate(out _));
}
[Fact]
public void Allocate_BeyondInitialCapacity_SignalsNeedsGrow()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 2);
alloc.Allocate(out var grow0);
alloc.Allocate(out var grow1);
alloc.Allocate(out var grow2);
Assert.False(grow0);
Assert.False(grow1);
Assert.True(grow2);
}
[Fact]
public void GrowTo_DoublesCapacityCorrectly()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 4);
alloc.GrowTo(8);
Assert.Equal(8, alloc.Capacity);
alloc.GrowTo(64);
Assert.Equal(64, alloc.Capacity);
}
[Fact]
public void LoadedCount_TracksAllocAndFree()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
Assert.Equal(0, alloc.LoadedCount);
var s0 = alloc.Allocate(out _);
var s1 = alloc.Allocate(out _);
Assert.Equal(2, alloc.LoadedCount);
alloc.Free(s0);
Assert.Equal(1, alloc.LoadedCount);
}
[Fact]
public void Free_TwiceForSameSlot_Throws()
{
var alloc = new TerrainSlotAllocator(initialCapacity: 8);
var s0 = alloc.Allocate(out _);
alloc.Free(s0);
Assert.Throws<InvalidOperationException>(() => alloc.Free(s0));
}
}