docs(N.5b T10): roadmap + ISSUES + CLAUDE.md + perf baseline updates

Document Phase N.5b shipping (terrain on the modern rendering path via
Path C — `TerrainModernRenderer` mirrors WB's `TerrainRenderManager`
pattern but consumes acdream's `LandblockMesh.Build` so retail's
`FSplitNESW` formula stays in lockstep with physics + visual mesh).

Changes:
- `docs/plans/2026-04-11-roadmap.md` — add N.5b row to the Shipped
table; promote N.5b's "Phases ahead" entry to ✓ SHIPPED with the
Path C resolution + perf reality check; refresh N.6 scope to note
Terrain has joined the modern path (legacy `Texture2D` retirement
scope narrows to Sky + Debug); update top-of-doc Status line.
- `docs/ISSUES.md` — close issue #51 (WB terrain-split formula
divergence). Move from OPEN to "Recently closed" with the Path C
resolution: never adopted WB's formula; modern dispatcher uses
retail's via `LandblockMesh.Build`. References `da56063` (the
black-terrain fix that landed within the N.5b ship chain).
- `CLAUDE.md` — add `TerrainModernRenderer.cs` to the WB integration
cribs list with the GL_INVALID_OPERATION caveat (use uvec2 +
`sampler2DArray(handle)` constructor, NOT direct
`uniform sampler2DArray` + `glProgramUniformHandleARB`). Update
the "Currently in flight" preamble: N.6 builds on N.5 + N.5b;
add an N.5b shipped paragraph linking the perf baseline doc.
- `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` — new doc
capturing the radius=5 Holtburg perf measurement (modern 6.4-7.0
µs median vs legacy 1.5 µs — modern is ~4× SLOWER on CPU at
radius=5). Documents the spec acceptance criterion #5 amendment,
the architectural wins that DO hold (zero glBindTexture/frame,
constant-cost dispatch as A.5 raises radius, per-LB frustum cull),
and the three high-value gotchas surfaced during implementation.

User-memory updates (outside repo, not in this commit):
- `memory/project_phase_n5b_state.md` — full N.5b state file with
the three gotchas captured.
- `memory/MEMORY.md` — index entry pointing at the state file.

Build: dotnet build green. No code changes in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent 7dfa2af6c0
commit 083c10c514
4 changed files with 220 additions and 80 deletions
38
CLAUDE.md
@@ -102,6 +102,14 @@ ourselves".
eventually picks it up finds the hook there; the change is localized:
extend `InstanceData` stride 64→80 bytes, add the field, mix into
fragment color in `mesh_modern.frag`. ~30 min when the time comes.
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs` — terrain dispatcher
on N.5's modern primitives. Mirrors WB's `TerrainRenderManager` pattern
(single global VBO/EBO + slot allocator + `glMultiDrawElementsIndirect`)
but is driven by acdream's `LandblockMesh.Build` so retail's `FSplitNESW`
formula is preserved (issue #51 resolved). Atlas handles are bound via the
uvec2 + `sampler2DArray(handle)` constructor pattern (NOT the direct
`uniform sampler2DArray` + `glProgramUniformHandleARB` form, which
raises `GL_INVALID_OPERATION` on at least one driver).
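
The uvec2 pattern above amounts to passing a 64-bit bindless handle as two 32-bit words. The following is a minimal Python sketch of that split, not acdream code. It assumes the low word goes in the first component and the high word in the second, which matches the `handle.low, handle.high` upload order noted in this commit's perf baseline doc. `split_handle` and `join_handle` are hypothetical names used only for illustration.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def split_handle(handle: int) -> Tuple[int, int]:
    """Split a 64-bit texture handle into (low, high) 32-bit words."""
    if not 0 <= handle < (1 << 64):
        raise ValueError("handle must fit in 64 bits")
    return handle & MASK32, (handle >> 32) & MASK32


def join_handle(low: int, high: int) -> int:
    """Rebuild the 64-bit handle from its two 32-bit words."""
    return ((high & MASK32) << 32) | (low & MASK32)


if __name__ == "__main__":
    low, high = split_handle(0x123456789ABCDEF0)
    print(hex(low), hex(high))
```

The two words are what a two-component unsigned uniform upload would carry; the shader side then rebuilds the sampler with the `sampler2DArray(handle)` constructor.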

**Execution phases:** R1→R8 in the architecture doc. Each phase has clear
goals, test criteria, and builds on the previous. Don't skip phases.
@ -504,13 +512,33 @@ acdream's plan lives in two files committed to the repo:
|
||||||
|
|
||||||
**Currently in flight: Phase N.6 — Perf polish.**
|
**Currently in flight: Phase N.6 — Perf polish.**
|
||||||
Roadmap entry at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
Roadmap entry at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
||||||
Builds on N.5. Legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`,
|
Builds on N.5 + N.5b. Legacy renderers (`InstancedMeshRenderer`,
|
||||||
`WbFoundationFlag`) were retired in the N.5 ship amendment — N.6 scope is
|
`StaticMeshRenderer`, `WbFoundationFlag`) were retired in the N.5 ship
|
||||||
perf-only: WB atlas adoption, persistent-mapped buffers, GPU-side culling,
|
amendment, and the terrain legacy renderer (`TerrainChunkRenderer` +
|
||||||
GL_TIME_ELAPSED query double-buffering, direct N.4 vs N.5 perf measurement,
|
`TerrainRenderer` + legacy `terrain.vert/.frag`) was retired in N.5b.
|
||||||
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky/Terrain/Debug).
|
N.6 scope is perf-only: WB atlas adoption, persistent-mapped buffers
|
||||||
|
(strong candidate after N.5b's per-frame DEIC `BufferSubData`),
|
||||||
|
GPU-side culling via compute pre-pass, GL_TIME_ELAPSED query
|
||||||
|
double-buffering, direct higher-radius perf comparison once A.5 lands,
|
||||||
|
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky / Debug
|
||||||
|
remain on the legacy path now that Terrain has migrated).
|
||||||
Plan + spec written when work begins.
|
Plan + spec written when work begins.
|
||||||

**Phase N.5b (Terrain on Modern Rendering Path) shipped 2026-05-09.**
`TerrainModernRenderer` mirrors WB's `TerrainRenderManager` pattern
(single global VBO/EBO + slot allocator + bindless atlas +
`glMultiDrawElementsIndirect`) but consumes `LandblockMesh.Build` so
retail's `FSplitNESW` formula is preserved (Path C; closes ISSUE #51).
Path A (substitute WB's `CalculateSplitDirection`) killed by 49.98%
divergence vs retail in
[`tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`](tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs).
At radius=5 in Holtburg modern is ~4× SLOWER on CPU than the legacy
chunked path was; architectural wins manifest at higher radius. Honest
perf baseline at
[`docs/plans/2026-05-09-phase-n5b-perf-baseline.md`](docs/plans/2026-05-09-phase-n5b-perf-baseline.md).
Plan archived at
[`docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`](docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md).

**Phase N.5 (Modern Rendering Path) shipped + amended 2026-05-08.** `WbDrawDispatcher`
|
**Phase N.5 (Modern Rendering Path) shipped + amended 2026-05-08.** `WbDrawDispatcher`
|
||||||
on bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23ms/frame
|
on bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23ms/frame
|
||||||
at Holtburg (~810 fps). **Ship amendment:** `InstancedMeshRenderer`,
|
at Holtburg (~810 fps). **Ship amendment:** `InstancedMeshRenderer`,
|
||||||
|
|
|
||||||
109
docs/ISSUES.md
|
|
@ -46,64 +46,6 @@ Copy this block when adding a new issue:
|
||||||
|
|
||||||
# Active issues
|
# Active issues
|
||||||
|
|
||||||
## #51 — WB's terrain-split formula diverges from retail's `FSplitNESW`
|
|
||||||
|
|
||||||
**Status:** OPEN
|
|
||||||
**Severity:** MEDIUM (blocks isolated N.2; affects sequencing of N-phase migration)
|
|
||||||
**Filed:** 2026-05-08
|
|
||||||
**Component:** terrain math / Phase N (WorldBuilder rendering migration)
|
|
||||||
|
|
||||||
**Description:** WB's `TerrainUtils.CalculateSplitDirection`
|
|
||||||
([references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44](references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44))
|
|
||||||
uses a different math expression from retail's `FSplitNESW`
|
|
||||||
(documented in CLAUDE.md as **the** real AC terrain split formula,
|
|
||||||
constants `0x0CCAC033` / `0x421BE3BD` / `0x6C1AC587` / `0x519B8F25`).
|
|
||||||
Ours is a degree-2 polynomial in (x,y); WB's is linear in (x,y).
|
|
||||||
They cannot be algebraically equivalent and disagree on a meaningful
|
|
||||||
fraction of cells.
|
|
||||||
|
|
||||||
**Concrete impact:** On any cell where the formulas pick different
|
|
||||||
diagonals, the same world position (X, Y) maps to different terrain
|
|
||||||
heights — up to ~2m for a sloped cell with one elevated corner. If a
|
|
||||||
caller mixes "WB-formula path" and "AC2D-formula path" for the same
|
|
||||||
cell, the player physics floats above or sinks below the visible
|
|
||||||
ground. This is the bug class fixed in
|
|
||||||
[src/AcDream.Core/Physics/TerrainSurface.cs:113-120](src/AcDream.Core/Physics/TerrainSurface.cs:113)
|
|
||||||
(diagonal-direction inversion).
|
|
||||||
|
|
||||||
**Files implicated:**
|
|
||||||
- `src/AcDream.Core/Physics/TerrainSurface.cs` — uses AC2D formula via
|
|
||||||
`IsSplitSWtoNE`
|
|
||||||
- `src/AcDream.Core/World/TerrainBlending.cs` — visual mesh, also AC2D
|
|
||||||
- `references/WorldBuilder/WorldBuilder.Shared/Modules/Landscape/Lib/TerrainUtils.cs:44`
|
|
||||||
— WB's diverging formula
|
|
||||||
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs`
|
|
||||||
— WB's render mesh (presumably also uses WB's formula in lockstep)
|
|
||||||
|
|
||||||
**Sequencing implication:** Phase N.2 (terrain math helpers
|
|
||||||
substitution) cannot be shipped in isolation — it must land alongside
|
|
||||||
visual terrain renderer migration (originally N.5, now moved to N.7
|
|
||||||
scope), at which point both physics and visual mesh switch to WB's
|
|
||||||
formula together. N.5 shipped entity rendering only; terrain remains
|
|
||||||
on acdream's own pipeline through N.7.
|
|
||||||
|
|
||||||
**Research needed (when N.7 picks this up):**
|
|
||||||
1. Quantify divergence: run WB's `CalculateSplitDirection` and our
|
|
||||||
`IsSplitSWtoNE` across all (lbX, lbY, cellX, cellY) tuples for a
|
|
||||||
representative landblock set; record disagreement rate.
|
|
||||||
2. Confirm WB's `TerrainGeometryGenerator` uses WB's formula in its
|
|
||||||
render mesh — if so, switching everything to WB's formula keeps
|
|
||||||
visual + physics synced. (Highly likely.)
|
|
||||||
3. Decide whether ANY retail-conformance test (e.g., physics matching
|
|
||||||
server-authoritative Z within tolerance) is invalidated by the
|
|
||||||
formula change.
|
|
||||||
|
|
||||||
**Acceptance:** Resolved when N.7 lands and both physics + visual
|
|
||||||
terrain use WB's split formula, OR when we decide to keep the AC2D
|
|
||||||
formula and patch WB's renderer in our fork.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## #50 — Road-edge tree at 0xA9B1 visible in acdream but not retail
|
## #50 — Road-edge tree at 0xA9B1 visible in acdream but not retail
|
||||||
|
|
||||||
**Status:** OPEN
|
**Status:** OPEN
|
||||||
|
|
@ -1758,6 +1700,57 @@ Unverified. The likely culprits, ranked by suspected probability:
|
||||||
|
|
||||||
# Recently closed
|
# Recently closed
|
||||||
|
|
||||||
## #51 — [DONE 2026-05-09 · da56063 + N.5b SHIP] WB's terrain-split formula diverges from retail's `FSplitNESW`

**Closed:** 2026-05-09
**Commit:** `da56063` (black-terrain fix; landed within Phase N.5b — see
`docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md` for the
ship commit chain)
**Component:** terrain math / Phase N.5b

**Resolution: Path C.** Phase N.5b lifted terrain rendering onto the
modern path (bindless atlas + `glMultiDrawElementsIndirect`) WITHOUT
adopting WB's `TerrainUtils.CalculateSplitDirection`. The pre-implementation
divergence test (`tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`)
confirmed the two formulas disagree on **49.98%** of sweep cells —
fundamentally incompatible with our shared physics + visual mesh, which
both rely on retail's `FSplitNESW` (constants `0x0CCAC033` / `0x421BE3BD` /
`0x6C1AC587` / `0x519B8F25`).
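
The shape of such a divergence sweep can be sketched in a few lines of Python. This is an illustration only: it does not reproduce `SplitFormulaDivergenceTest.cs`, retail's `FSplitNESW`, or WB's `CalculateSplitDirection`. The two `toy_split_*` predicates are hypothetical stand-ins, and the 8×8 cells-per-landblock constant is an assumption.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Tuple

Cell = Tuple[int, int]                      # global cell coordinates (x, y)
SplitPredicate = Callable[[int, int], bool]

CELLS_PER_SIDE = 8  # assumption: 8x8 cells per landblock


def sweep_cells(lb_xs: Iterable[int], lb_ys: Iterable[int]) -> Iterator[Cell]:
    """Yield every global cell coordinate in the given landblock ranges."""
    cells = range(CELLS_PER_SIDE)
    for lb_x, lb_y, cx, cy in product(lb_xs, lb_ys, cells, cells):
        yield lb_x * CELLS_PER_SIDE + cx, lb_y * CELLS_PER_SIDE + cy


def divergence_rate(a: SplitPredicate, b: SplitPredicate,
                    cells: Iterable[Cell]) -> float:
    """Fraction of cells on which the two split predicates disagree."""
    total = 0
    disagree = 0
    for x, y in cells:
        total += 1
        if a(x, y) != b(x, y):
            disagree += 1
    if total == 0:
        raise ValueError("empty sweep")
    return disagree / total


# Hypothetical placeholder predicates, NOT the real formulas.
def toy_split_a(x: int, y: int) -> bool:
    return (x + y) % 2 == 0


def toy_split_b(x: int, y: int) -> bool:
    return x % 2 == 0


if __name__ == "__main__":
    rate = divergence_rate(toy_split_a, toy_split_b,
                           sweep_cells(range(4), range(4)))
    print(f"{rate:.2%}")
```

A rate close to 50% is what two statistically unrelated binary choices produce when at least one of them is balanced, so a measured 49.98% is consistent with the two formulas sharing essentially no structure.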

Path C: keep retail's `FSplitNESW` formula via `LandblockMesh.Build` →
`TerrainBlending.CalculateSplitDirection`; mirror WB's `TerrainRenderManager`
architectural pattern (single global VBO/EBO + slot allocator + bindless
atlas + multi-draw indirect) but feed it acdream's mesh. Modern dispatcher
(`TerrainModernRenderer`) replaces `TerrainChunkRenderer` (deleted in T9
along with `TerrainRenderer` + `terrain.vert/.frag`).
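
The "slot allocator" half of that pattern can be illustrated with a small free-list sketch in Python. It is an assumption-laden illustration, not a port of `TerrainSlotAllocator.cs`: the class shape, the method names, and the fixed `bytes_per_slot` sizing are all hypothetical. The idea is that each landblock owns one fixed-size region of the shared buffer and returns it to a free list when unloaded.

```python
from typing import Dict, Hashable, List


class SlotAllocator:
    """Hands out fixed-size slots of one shared buffer, one per key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # Stack of free slot indices; slot 0 is handed out first.
        self._free: List[int] = list(range(capacity - 1, -1, -1))
        self._slot_of: Dict[Hashable, int] = {}

    def acquire(self, key: Hashable) -> int:
        """Return the slot for key, allocating one if needed."""
        if key in self._slot_of:
            return self._slot_of[key]
        if not self._free:
            raise RuntimeError("no free slots")
        slot = self._free.pop()
        self._slot_of[key] = slot
        return slot

    def release(self, key: Hashable) -> None:
        """Return key's slot to the free list for reuse."""
        self._free.append(self._slot_of.pop(key))

    def byte_offset(self, key: Hashable, bytes_per_slot: int) -> int:
        """Offset of key's region inside the shared buffer."""
        return self._slot_of[key] * bytes_per_slot


if __name__ == "__main__":
    alloc = SlotAllocator(capacity=4)
    print(alloc.acquire(0xA9B1), alloc.acquire(0xA9B2))
```

Because slots are fixed-size and reused, loading or unloading a landblock touches only its own region of the buffer, and a draw command can address that region by offset alone.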

Path A (substitute WB's formula) was killed by the divergence test.
Path B (fork-patch WB's renderer to use retail's formula) was rejected
for permanent maintenance burden. Path C ships the architectural
pattern while preserving retail-formula compliance.

Visual mesh and physics both still consume retail's `FSplitNESW`; they
remain in lockstep, with no triangle-Z hover. The N.6 / N.7 sequencing
implication this issue carried (substitute physics math only when the
visual mesh migrates) is moot — neither side ever switches to WB's
formula.

**Files added:**
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs`
- `src/AcDream.Core/Terrain/TerrainSlotAllocator.cs`
- `src/AcDream.App/Rendering/Shaders/terrain_modern.vert`
- `src/AcDream.App/Rendering/Shaders/terrain_modern.frag`
- `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs` (the
  test that killed Path A)

**Files deleted (T9):**
- `src/AcDream.App/Rendering/TerrainChunkRenderer.cs`
- `src/AcDream.App/Rendering/TerrainRenderer.cs`
- `src/AcDream.App/Rendering/Shaders/terrain.vert`
- `src/AcDream.App/Rendering/Shaders/terrain.frag`

---

## #43 — [DONE 2026-05-05 · 9e4772a] Slope staircase on observed player remotes (anim-only fallback ignored slope)
|
## #43 — [DONE 2026-05-05 · 9e4772a] Slope staircase on observed player remotes (anim-only fallback ignored slope)
|
||||||
|
|
||||||
**Closed:** 2026-05-05
|
**Closed:** 2026-05-05
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# acdream — strategic roadmap
|
# acdream — strategic roadmap
|
||||||
|
|
||||||
**Status:** Living document. Updated 2026-05-08 for Phase N.5 shipping (bindless textures + `glMultiDrawElementsIndirect` on top of N.4's foundation; CPU dispatcher 1.23ms/frame at Holtburg, ~810 fps) + N.6 becomes the new in-flight phase (retire legacy renderers + perf polish).
|
**Status:** Living document. Updated 2026-05-09 for Phase N.5b shipping (terrain on the modern rendering path via Path C — mirror WB's `TerrainRenderManager` pattern, consume `LandblockMesh.Build` for retail formula compliance; closes ISSUE #51). N.6 (perf polish) remains the in-flight phase.
|
||||||
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@ -61,6 +61,7 @@
|
||||||
| N.3 | WorldBuilder-backed texture decode — `SurfaceDecoder` delegates INDEX16 / P8 / A8R8G8B8 / R8G8B8 / A8(+Additive) to `TextureHelpers.Fill*`; `isAdditive` threaded through (terrain alpha → `FillA8Additive`, non-additive entity surfaces → `FillA8`). R5G6B5 + A4R4G4B4 newly handled (previously magenta). X8R8G8B8, DXT1/3/5, SolidColor remain ours (no WB equivalent). 9 conformance tests prove byte-identical equivalence per format. | Live ✓ |
|
||||||
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6 (retired early in N.5 ship amendment). | Live ✓ |
|
||||||
| N.5 | Modern rendering path — lifted `WbDrawDispatcher` onto bindless textures (`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch data @ binding=1, indirect commands) + 2 indirect draw calls (opaque + transparent). ~12-15 GL calls per frame regardless of group count, down from hundreds-of-per-group in N.4. CPU dispatcher: 1.23 ms/frame median at Holtburg courtyard (1662 groups, ~810 fps sustained). All textures on the WB modern path use 1-layer `Texture2DArray` + `sampler2DArray`. Legacy callers keep `Texture2D` / `sampler2D` via the parallel `TextureCache` path until N.6 retires them. Three gotchas captured in memory: texture target lock-in, bindless Dispose order (two-phase non-resident before delete), GL_TIME_ELAPSED double-buffering. **Ship amendment 2026-05-08:** legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`, `WbFoundationFlag`) retired within N.5 — modern path is mandatory; missing bindless throws `NotSupportedException` at startup. N.6 scope narrowed accordingly. Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`. | Live ✓ |
|
||||||
|
| N.5b | Terrain on the modern rendering path — `TerrainModernRenderer` replaces `TerrainChunkRenderer` (the latter plus `TerrainRenderer` + `terrain.vert/.frag` deleted). Single global VBO/EBO with slot allocator (one slot per landblock), per-frame `DrawElementsIndirectCommand[]` upload + `glMultiDrawElementsIndirect`, bindless atlas handles passed as `uvec2` uniforms reconstructed via `sampler2DArray(handle)`. **Path C** chosen: mirrors WB's `TerrainRenderManager` pattern but consumes `LandblockMesh.Build` so retail's `FSplitNESW` formula is preserved (closes ISSUE #51). Path A killed by 49.98% measured divergence between WB's `CalculateSplitDirection` and retail's at addr `00531d10`; Path B (fork-patch WB) rejected for permanent maintenance burden. Perf at Holtburg radius=5 (commit `da56063`): modern 6.4-7.0 µs / 9-14 µs p95 vs legacy 1.5 µs / 3.0 µs — **modern is ~4× SLOWER on CPU at radius=5** because legacy's 16×16-LB chunking collapsed visible LBs to one `glDrawElements`. Architectural wins (zero `glBindTexture`/frame, constant-cost dispatch, per-LB frustum cull) manifest at higher radius (A.5 territory). Spec acceptance criterion 5 ("≥10% lower CPU at radius=5") amended via `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`. Three gotchas captured in memory: `uniform sampler2DArray` + `glProgramUniformHandleARB` GL_INVALID_OPERATIONs on at least one driver (use `uniform uvec2` + `sampler2DArray(handle)` constructor instead — N.5's mesh_modern pattern); `MaybeFlushTerrainDiag` median-calc underflow on first sample; visual gates need actual visual confirmation, not assent. Plan archived at `docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`. | Live ✓ |
|
||||||
|
|
||||||
Plus polish that doesn't get its own phase number:
|
Plus polish that doesn't get its own phase number:
|
||||||
- FlyCamera default speed lowered + Shift-to-boost
|
- FlyCamera default speed lowered + Shift-to-boost
|
||||||
|
|
@ -641,23 +642,43 @@ for our deletions/additions; merge upstream `master` periodically.
|
||||||
lock-in, bindless Dispose two-phase order, GL_TIME_ELAPSED double-
|
lock-in, bindless Dispose two-phase order, GL_TIME_ELAPSED double-
|
||||||
buffering. Plan archived at
|
buffering. Plan archived at
|
||||||
`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
|
`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
|
||||||
- **N.5b — Terrain rendering on N.5 path.** Wire WB's
|
- **✓ SHIPPED — N.5b — Terrain on the modern rendering path.** Shipped
|
||||||
`TerrainRenderManager` + `LandSurfaceManager` + `TerrainGeometryGenerator`
|
2026-05-09. **Path C** (mirror WB's `TerrainRenderManager` pattern but
|
||||||
onto the modern rendering path. Closes N.2's deferred terrain math
|
consume `LandblockMesh.Build` for retail-formula compliance). Path A
|
||||||
substitution: visual mesh and physics both switch to WB's
|
(substitute WB's `CalculateSplitDirection`) killed during pre-implementation
|
||||||
`CalculateSplitDirection` + `GetHeight` + `GetNormal` in lockstep,
|
divergence test: WB's formula disagrees with retail's `FSplitNESW`
|
||||||
resolving ISSUE #51. **Estimate: 1-2 weeks** (was 2-3 — modern path
|
(addr `00531d10`) on **49.98%** of cells across `tests/AcDream.Core.Tests/Terrain/SplitFormulaDivergenceTest.cs`'s
|
||||||
primitives already in place from N.5).
|
sweep — wholly incompatible with our shared physics + visual mesh.
|
||||||
|
Path B (fork-patch WB to use retail's formula) rejected for permanent
|
||||||
|
maintenance burden. Path C ships the architectural pattern (single
|
||||||
|
global VBO/EBO + slot allocator + bindless atlas + `glMultiDrawElementsIndirect`)
|
||||||
|
while keeping retail's formula via `LandblockMesh.Build` →
|
||||||
|
`TerrainBlending.CalculateSplitDirection`. `TerrainModernRenderer` +
|
||||||
|
`terrain_modern.vert/.frag` shipped, `TerrainChunkRenderer` +
|
||||||
|
`TerrainRenderer` + legacy `terrain.vert/.frag` deleted in T9.
|
||||||
|
Closes ISSUE #51. **Perf reality check:** at radius=5 in Holtburg,
|
||||||
|
modern is ~4× SLOWER on CPU than legacy was (6.4 µs vs 1.5 µs median;
|
||||||
|
legacy collapsed radius=5's visible LBs into one `glDrawElements`
|
||||||
|
via 16×16-LB chunking). Architectural wins (zero `glBindTexture`/frame,
|
||||||
|
constant-cost dispatch as A.5 raises radius, per-LB frustum cull)
|
||||||
|
manifest at higher radius. Spec acceptance criterion #5 was wrong;
|
||||||
|
amended via `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`. Plan
|
||||||
|
archived at `docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`.
|
||||||
- **N.6 — Perf polish.** **Currently in flight.**
|
- **N.6 — Perf polish.** **Currently in flight.**
|
||||||
Builds on N.5. Legacy renderer retirement was pulled forward into N.5
|
Builds on N.5 + N.5b. Legacy renderer retirement was pulled forward
|
||||||
ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`, and
|
into N.5 ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`,
|
||||||
`WbFoundationFlag` are already gone. N.6 scope: WB atlas adoption for
|
`WbFoundationFlag` are gone — and the terrain legacy renderer
|
||||||
memory savings on shared content, persistent-mapped buffers if
|
(`TerrainChunkRenderer` + `TerrainRenderer` + `terrain.vert/.frag`)
|
||||||
`glBufferData` shows up in profiling, GPU-side culling via compute
|
retired in N.5b. N.6 scope: WB atlas adoption for memory savings
|
||||||
pre-pass, GL_TIME_ELAPSED query double-buffering (deferred from N.5 —
|
on shared content, persistent-mapped buffers if `glBufferData` shows
|
||||||
diagnostic shows `gpu_us=0/0` under `ACDREAM_WB_DIAG=1`), direct N.4
|
up in profiling (the modern terrain path's per-frame DEIC `BufferSubData`
|
||||||
vs N.5 perf measurement, retire the legacy `Texture2D`/`sampler2D` path
|
is a candidate), GPU-side culling via compute pre-pass (eliminates
|
||||||
in `TextureCache` (currently kept for Sky + Terrain + Debug).
|
the per-frame slot walk + DEIC build entirely), GL_TIME_ELAPSED query
|
||||||
|
double-buffering (deferred from N.5 — diagnostic shows `gpu_us=0/0`
|
||||||
|
under `ACDREAM_WB_DIAG=1`), direct higher-radius perf comparison once
|
||||||
|
A.5 lands (where modern's architectural wins manifest), retire the
|
||||||
|
legacy `Texture2D`/`sampler2D` path in `TextureCache` (currently kept
|
||||||
|
for Sky + Debug + particle paths now that Terrain has migrated).
|
||||||
Plan + spec written when work begins. **Estimate: 1-2 weeks.**
|
Plan + spec written when work begins. **Estimate: 1-2 weeks.**
|
||||||
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
|
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
|
||||||
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's
|
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's
|
||||||
|
|
|
||||||
98
docs/plans/2026-05-09-phase-n5b-perf-baseline.md
Normal file
@@ -0,0 +1,98 @@
# Phase N.5b — terrain perf baseline

**Captured:** 2026-05-09 at Holtburg town dueling field, radius=5, ~30s standstill.

## Methodology

Both renderers were measured on the same build (commit `da56063`) with
`ACDREAM_WB_DIAG=1`. The build included a TEMPORARY `ACDREAM_LEGACY_TERRAIN=1`
env-var toggle (since removed when T9 deleted the legacy renderer) that routed
Draw through the legacy renderer for direct comparison. Both renderers were
constructed and fed `AddLandblock` / `RemoveLandblock` in parallel; only one drew
per frame; the same Stopwatch wrapped whichever ran.
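
For readers reproducing the summary statistics, here is a minimal Python sketch of turning per-frame Stopwatch samples into a median and a p95. The nearest-rank percentile rule is an assumption; this document does not say which rule the diagnostic output uses. `summarize_cpu_us` is a hypothetical helper name.

```python
import statistics
from typing import Sequence, Tuple


def summarize_cpu_us(samples_us: Sequence[float]) -> Tuple[float, float]:
    """Return (median, p95) of per-frame CPU samples in microseconds."""
    if not samples_us:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_us)
    # Nearest-rank p95: 1-based rank ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return statistics.median(ordered), ordered[rank - 1]


if __name__ == "__main__":
    median, p95 = summarize_cpu_us([6.4, 6.6, 7.0, 6.5, 13.8, 6.7])
    print(median, p95)
```

With a standstill capture the median is a stable headline number, while the p95 shows how far occasional slow frames stray from it.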

## Numbers

| Renderer | cpu_us median | cpu_us p95 | draws/frame | Visible LBs |
|---|---|---|---|---|
| **Legacy** (`TerrainChunkRenderer`) | 1.5 | 3.0 | 1 (1 chunk) | 132-143 (whole chunk) |
| **Modern** (`TerrainModernRenderer`) | 6.4-7.0 | 9-14 | ~36-51 | 36-51 (per-LB cull) |

(Legacy `draws=1` because its 16×16-LB chunking collapses radius=5's 121 visible
landblocks into a single chunk, dispatched as one `glDrawElements`. Modern issues
one `glMultiDrawElementsIndirect` with N=36-51 sub-commands.)

## Acceptance criterion

The N.5b spec acceptance criterion 5 read: "CPU dispatcher time at radius=5 ≥10%
lower than today's per-LB-binds path." The captured numbers show modern is ~4×
HIGHER on CPU at radius=5. **The criterion was wrong** — at radius=5 in Holtburg,
legacy's chunked path was already collapsed to one draw call. The architectural
wins of multi-draw indirect manifest at higher chunk counts (A.5 territory).

The spec is amended via this doc: ship N.5b on visual identity + structural
correctness rather than CPU savings at radius=5.

## Architectural wins of the modern path (real, even when CPU is higher)

1. **Zero `glBindTexture` per frame.** Bindless atlas handles are made resident
   once at startup; the modern shader samples via `sampler2DArray(uvec2 handle)`.
   Legacy issued 2 `glBindTexture(Texture2DArray)` calls per frame.

2. **Constant-cost dispatch.** As A.5 raises the streaming radius (next phase),
   the visible chunk count grows. Legacy scales linearly: at radius=10 (4× chunks)
   it's 4 `glDrawElements` calls; at radius=15 (≥9 chunks) it's 9+ calls. Modern
   stays at exactly 1 `glMultiDrawElementsIndirect` regardless.

3. **Per-LB frustum culling.** Legacy culled at chunk granularity (16×16 LBs);
   modern culls per-LB. At a typical Holtburg view, ~36-51 of 132 loaded LBs are
   actually visible; legacy drew the entire 132-LB chunk (roughly 2.6-3.7× the
   visible work pushed to GPU vertex/fragment stages, even though CPU dispatch
   was cheap).
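
The per-LB cull in item 3 is, in general terms, an axis-aligned bounding box test against the frustum planes. Below is a minimal Python sketch of the standard "positive vertex" form of that test. It is not taken from `TerrainModernRenderer`: the plane convention (a point is inside when `n·p + d >= 0`) and the names `aabb_visible` and `visible_slots` are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

Plane = Tuple[float, float, float, float]   # (nx, ny, nz, d)
Box = Tuple[Sequence[float], Sequence[float]]  # (min corner, max corner)


def aabb_visible(box_min: Sequence[float], box_max: Sequence[float],
                 planes: Iterable[Plane]) -> bool:
    """Conservative AABB-vs-planes test: False only if fully outside a plane."""
    for nx, ny, nz, d in planes:
        # Pick the box corner farthest along the plane normal.
        px = box_max[0] if nx >= 0 else box_min[0]
        py = box_max[1] if ny >= 0 else box_min[1]
        pz = box_max[2] if nz >= 0 else box_min[2]
        if nx * px + ny * py + nz * pz + d < 0:
            return False  # even the farthest corner is behind this plane
    return True


def visible_slots(slots: Dict[Hashable, Box],
                  planes: Iterable[Plane]) -> List[Hashable]:
    """Walk every loaded slot and keep the keys whose box passes the test."""
    plane_list = list(planes)
    return [key for key, (lo, hi) in slots.items()
            if aabb_visible(lo, hi, plane_list)]


if __name__ == "__main__":
    slab = [(1.0, 0.0, 0.0, 0.0), (-1.0, 0.0, 0.0, 10.0)]  # 0 <= x <= 10
    print(visible_slots({"a": ((1, 1, 1), (2, 2, 2)),
                         "b": ((11, 0, 0), (12, 1, 1))}, slab))
```

The test is conservative: it can keep a box that sits just outside a frustum corner, but it never drops a box that is actually visible, which is the safe direction for a renderer.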

## Why modern's CPU was higher at radius=5

Per-frame work in modern (a budget on the order of microseconds on this scene):
- Walk all loaded slots checking visibility (~120 slots) → AABB test each
- Build DEIC array (51 entries × 20 bytes = 1020 bytes)
- `glBufferSubData(DRAW_INDIRECT_BUFFER, ...)` — driver memcpy
- 2× `glProgramUniform2(..., handle.low, handle.high)` for atlas handles
- `glBindVertexArray` + `glMemoryBarrier(GL_COMMAND_BARRIER_BIT)` + `glMultiDrawElementsIndirect`

Legacy's per-frame work:
- Bind 2 textures
- Bind one VAO (the chunk)
- One `glDrawElements`

The DEIC array build + buffer upload alone is ~3-5µs at radius=5 on this hardware,
which is the bulk of the modern overhead. At higher radius, this overhead amortizes:
the buffer is similar size, but the alternative (legacy's N draws) grows.
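
The 20-byte figure follows from the indirect command layout: five 32-bit fields (`count`, `instanceCount`, `firstIndex`, `baseVertex`, `baseInstance`), with `baseVertex` signed. A short Python sketch using `struct` checks the arithmetic. The little-endian packing and the example field values are assumptions for illustration, not acdream's actual upload code.

```python
import struct
from typing import Iterable, Tuple

# count, instanceCount, firstIndex (unsigned), baseVertex (signed), baseInstance
DEIC = struct.Struct("<IIIiI")

Command = Tuple[int, int, int, int, int]


def pack_commands(commands: Iterable[Command]) -> bytes:
    """Pack indirect draw commands into one tightly packed byte buffer."""
    return b"".join(DEIC.pack(*cmd) for cmd in commands)


if __name__ == "__main__":
    # 51 visible landblocks, each drawing a hypothetical 384-index slot.
    cmds = [(384, 1, slot * 384, 0, 0) for slot in range(51)]
    payload = pack_commands(cmds)
    print(DEIC.size, len(payload))
```

At about 1 KB per frame the upload itself is small, which is consistent with the note that the fixed per-frame steps, rather than data volume, dominate at radius=5.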

## Follow-up work

- **A.5 (next phase)** will exercise the higher-radius case where modern wins.
  Capture a fresh baseline at radius=8 / 10 once A.5 lands.
- **N.6 perf polish** can investigate persistent-mapped buffers for the indirect
  buffer, which would eliminate the per-frame `glBufferSubData`. Likely a small win
  at radius=5 (single ~1KB upload), bigger at higher radii.
- **GPU-side culling** (compute shader generating the DEIC array directly into
  the indirect buffer) eliminates the CPU slot walk + DEIC build entirely. N.6 or
  later territory; only worth it if profiling shows the CPU walk is hot.

## Lessons captured to memory

`memory/project_phase_n5b_state.md` records the high-value gotchas surfaced
during N.5b implementation. Three that bite particularly hard:

1. **`uniform sampler2DArray` + `glProgramUniformHandleARB` is unreliable.** Some
   drivers (NVIDIA Windows in this case) reject the combination with
   `GL_INVALID_OPERATION`. Use the `uniform uvec2` + `sampler2DArray(handle)`
   constructor pattern instead — N.5's mesh_modern uses this, and N.5b's
   terrain_modern adopted it after the black-terrain regression.

2. **`MaybeFlushTerrainDiag` underflow.** A naive median calc (`copy[N - nz/2]`)
   indexes `copy[N]` when only one sample has been recorded, because `nz/2`
   truncates to 0. Use `copy[N - 1 - (nz - 1) / 2]` instead.

3. **Visual gate must actually be visually confirmed.** "Go" doesn't mean
   "verified." During N.5b's gate the user said "go" without launching, which
   masked the black-terrain regression for hours. The gate must include the
   user reporting actual visual confirmation, not assent to proceed.
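
The index arithmetic in gotcha 2 is easy to check in isolation. The sketch below is a hedged Python illustration, not the `MaybeFlushTerrainDiag` code. It assumes an ascending-sorted buffer of length `N` in which the `nz` recorded (non-zero) samples sort to the end, so valid sample indices run from `N - nz` to `N - 1`. Both function names are hypothetical.

```python
def naive_median_index(n: int, nz: int) -> int:
    """The buggy form: yields n (out of range) when nz == 1."""
    return n - nz // 2


def fixed_median_index(n: int, nz: int) -> int:
    """The corrected form: always lands inside [n - nz, n - 1]."""
    if not 1 <= nz <= n:
        raise ValueError("need 1 <= nz <= n")
    return n - 1 - (nz - 1) // 2


if __name__ == "__main__":
    n = 8
    for nz in range(1, n + 1):
        print(nz, naive_median_index(n, nz), fixed_median_index(n, nz))
```

For an odd `nz` the fixed index is the exact middle of the recorded samples; for an even `nz` it picks the upper of the two middle samples.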