Commit graph

706 commits

Author SHA1 Message Date
Erik
1658882439 fix(A.5 T4-T6): bootstrap guard + dead enum + test cleanups
Code review on commits 7bcabab/fb6b61e/326b698 flagged 2 Important +
4 Minor issues. Apply all fixes:

Important:
- Two-tier RecenterTo + MarkResidentFromBootstrap now throw
  InvalidOperationException on misuse — calling RecenterTo before the
  bootstrap silently emitted the entire window as fresh loads (no
  demotes/unloads since _tierResidence was empty), a correctness hazard
  that produced no exception. Calling MarkResidentFromBootstrap twice
  silently dropped accumulated tier state. Both now crash loudly via
  a _bootstrapped flag.
- Dropped TierResidence.None from the enum — never assigned, never
  checked; absence from the dictionary already encodes "not resident."

Minor:
- Renamed test: RecenterTo_FirstTick_* → ComputeFirstTickDiff_FirstTick_*
  (the test calls ComputeFirstTickDiff, not RecenterTo).
- Strengthened RecenterTo_PlayerWalks_NullToFar_* with assertions for
  ToPromote.Count==3 (the x=102 column promoting Far→Near) and
  ToUnload.Empty (everything within hysteresis).
- Replaced System.Math.Abs with Math.Abs in new code to match the
  file's existing `using System;` convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:49:35 +02:00
Erik
326b698161 test(A.5 T6): StreamingRegion transitions + hysteresis + oscillation coverage
Adds 5 tests to StreamingRegionTwoTierTests covering all tier-transition paths:
- FarToNear promote (walk 2 east from initial center)
- NullToNear teleport (loads 9 near + 40 far for a fully fresh region)
- NearToFar demote only after NearRadius+2 hysteresis threshold
- FarToNull unload only after FarRadius+2 hysteresis threshold
- oscillation no-thrash: bouncing 1 LB across a near boundary fires 0 demotes
  and at most 5 promotes total (one initial settle of the x=100 near-column)

Oscillation test fix: initialise the region at the oscillation midpoint
(103,100) rather than at a distant starting center (100,100) so the
initial move into the oscillation range doesn't itself trigger legitimate
demotes, isolating the no-thrash invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:39:16 +02:00
Erik
fb6b61e8ef feat(A.5 T5): StreamingRegion two-tier RecenterTo + TierResidence tracking
Adds TierResidence enum (None/Far/Near), _tierResidence dictionary seeded
by MarkResidentFromBootstrap, and the canonical two-tier RecenterTo overload
returning TwoTierDiff. Pass 1 walks the new far window and emits ToLoadFar /
ToLoadNear / ToPromote; Pass 2 walks prior residents and emits ToDemote /
ToUnload using Chebyshev hysteresis thresholds (NearRadius+2 / FarRadius+2).
EncodeLandblockIdForTest exposes the encoding rule to test assemblies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:36:20 +02:00
Erik
7bcababf82 feat(A.5 T4): StreamingRegion ComputeFirstTickDiff
Adds the first-tick bootstrap diff: ToLoadNear for the (2*near+1)^2 inner
window, ToLoadFar for the outer annulus up to FarRadius. Uses Chebyshev
distance, matching existing Recenter convention.

Also renames the single-tier RecenterTo → RecenterToSingleTier to free
the canonical name for the upcoming two-tier overload (T5). Updates
StreamingRegionTests and StreamingController to call the renamed method.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:34:55 +02:00
Erik
378f32ac7a fix(A.5 T3): pin Radius==FarRadius invariant in two-tier ctor test
Code review on commit 7fd9c82 flagged that the test asserted NearRadius,
FarRadius, CenterX, CenterY but not the load-bearing alias
Radius == FarRadius. That alias is what makes the existing hysteresis
math (Radius+2 unload threshold) correctly target the far-tier boundary.
Future typos would silently break far-tier hysteresis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:30:30 +02:00
Erik
7fd9c82954 test(A.5 T3): StreamingRegion two-radius constructor
Add NearRadius/FarRadius properties and a four-arg constructor
(centerX, centerY, nearRadius, farRadius). Radius is set to farRadius
so existing hysteresis math (unload threshold = Radius+2) uses the
outer ring as the bookkeeping boundary. Old three-arg constructor
becomes a thin wrapper: this(cx, cy, radius, radius) — no behaviour
change, 25 pre-existing streaming tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:27:50 +02:00
Erik
21550ecff2 fix(A.5 T2): document Kind placeholder in HandleJob
Code review on commit 90a2027 flagged that HandleJob silently ignores
load.Kind. Add a TODO(A.5 T11/T16) comment at the case arm so the
unused field reads as a planned stub, not a bug.

No semantic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:25:26 +02:00
Erik
90a2027d14 feat(A.5 T2): TwoTierDiff record + LandblockStreamJob.Load.Kind
Adds TwoTierDiff — the five-list output of StreamingRegion.RecenterTo
(ToLoadFar/Near, ToPromote, ToDemote, ToUnload) per spec §4.2. Used by
T3–T6 (StreamingRegion) and T13 (StreamingController).

Extends LandblockStreamJob.Load with a LandblockStreamJobKind parameter
so the streaming worker can route far vs near vs promote jobs differently
(spec §4.3). Patches the one call site in LandblockStreamer.EnqueueLoad
with LoadNear as a placeholder that preserves today's full-load semantics
until T11 activates the worker thread and T16 routes by tier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:20:48 +02:00
Erik
d67d16fcfc feat(A.5 T1): LandblockStreamTier + LandblockStreamJobKind enums
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:15:57 +02:00
Erik
b373523f98 docs(A.5): two-tier streaming + horizon LOD implementation plan
28-task TDD-first implementation plan for Phase A.5. Maps each spec
section to concrete bite-sized tasks with failing-test → minimal-impl
→ commit cycles. Self-review at plan footer cross-checks coverage,
type consistency, placeholders.

Plan structure:
- T1-T6: StreamingRegion two-tier + 5-list TwoTierDiff with hysteresis
- T7: LandblockStreamResult.Loaded.Tier + MeshData; Promoted variant
- T8: WorldEntity AABB cache + dirty flag
- T9-T12: off-thread mesh build (ConcurrentDictionary + DatLock + worker activation + DI)
- T13-T16: StreamingController two-tier + GpuWorldState two-tier ops + GameWindow wiring
- T17-T18: WbDrawDispatcher bucketing tightening (Change #1 + Change #2)
- T19-T21: visual quality (mipmaps + A2C + depth-write lock-in)
- T22: fog params from N₁/N₂ + env-var multipliers
- T23: BUDGET_OVER flag in DIAG output
- T24-T26: perf baseline (before/after) + visual user gate
- T27-T28: roadmap/ISSUES/CLAUDE.md updates + memory + SHIP commit

Plan at docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md.
Spec at docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md (fcaff71).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 22:10:38 +02:00
Erik
fcaff71352 docs(A.5): two-tier streaming + horizon LOD design spec
Brainstorm output for Phase A.5. Locks key decisions:
- Hardware target: 240 Hz / 1440p, 4.166ms vsync budget
- Tier radii: N₁=4 (full detail, 81 LBs), N₂=12 (terrain only, 544 LBs)
- Far-tier strategy: terrain-only + fog blend at N₁ (zero engineering cost)
- Bucketing: tighten existing per-LB walk (Q5 Option A); persistent
  groups deferred to a later phase
- Worker thread: single-thread mesh build off render path (Q6 Option A)
- Hysteresis: existing radius+2 convention applied to both tiers
- Visual ride-alongs: mipmaps + anisotropic + A2C/MSAA + depth-write audit
- Acceptance: 240Hz standstill / 144 FPS walking (Q9 Option B)

Spec at docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md.
Awaiting user review before transitioning to writing-plans.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 21:52:00 +02:00
Erik
1e6a8123c6 Merge branch 'claude/happy-joliot-f67060' — Phase N.5b SHIP + A.5 handoff
Phase N.5b: terrain on the modern rendering path. TerrainModernRenderer
replaces TerrainChunkRenderer; bindless atlas via uvec2 handle + GLSL
sampler-from-handle constructor; constant-cost dispatch (~6µs/frame)
regardless of radius. Closes issue #51. 114 tests pass; conformance
sentinel max |delta| = 0.015 mm. Honest perf baseline doc captures
that modern is ~4x slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw call; architectural
wins manifest at higher radius (A.5).

Three high-value gotchas captured to memory:
1. uniform sampler2DArray + glProgramUniformHandleARB unreliable
   across drivers; default to uvec2 handle + sampler constructor.
2. Median calc copy[N - nz/2] underflows for nz<2.
3. Visual gate 'go' != 'verified'.

Plus: A.5 cold-start handoff at docs/research/2026-05-10-phase-a5-handoff.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 21:13:55 +02:00
Erik
f7f88674e1 docs(A.5): cold-start handoff for the next session
Records what N.5b shipped, where the actual FPS bottleneck lives
(WbDrawDispatcher entity cull at ~4.3ms/frame, 86% of frame budget;
terrain dispatcher is now <1% of frame), and what A.5 has to do to
make the world look big without falling off a perf cliff.

Three concrete A.5 deliverables:
1. Two-tier streaming (near = full, far = terrain-only)
2. Per-LB entity bucketing in WbDrawDispatcher
3. Off-thread LandblockMesh.Build to avoid streaming hitches at higher
   radius

Eight brainstorm questions for the next session, plus acceptance
criteria, files-to-read list, and explicit "don't do" warnings (don't
raise STREAM_RADIUS without tiering in place; don't put scenery in
far tier without an impostor pipeline; don't break the N.5b conformance
sentinel; etc.).

User's stated goal verbatim: "great smooth HIGH fps visuals. Should
look great. As long as it scales and we get very high FPS." This
reframes priorities away from radius=5 micro-optimization toward
visual scale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 21:11:46 +02:00
Erik
08b736207c phase(N.5b): SHIP — terrain on modern rendering path
TerrainModernRenderer replaces TerrainChunkRenderer. Single global
VBO/EBO + slot allocator + glMultiDrawElementsIndirect. Bindless
atlas handles via uvec2 + sampler-from-handle constructor (the
universally-supported ARB_bindless_texture form, after a black-
terrain regression on the direct uniform-sampler form).

Path C: WB renderer pattern + acdream's LandblockMesh.Build for
retail's FSplitNESW formula compliance. Closes issue #51.

Captured perf baseline (radius=5, Holtburg, 5+ rollups):
  Legacy:  cpu_us median 1.5  / p95 3.0   (1 chunk = 1 glDrawElements)
  Modern:  cpu_us median 6.4-7.0 / p95 9-14  (51 visible LBs, 1 MDI)

Modern is ~4× slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw. Architectural wins
(zero glBindTexture/frame; constant-cost dispatch as A.5 raises
radius) manifest at higher scene complexity. Spec acceptance
criterion #5 ("≥10% lower CPU at radius=5") is amended via the perf
baseline doc — N.5b ships on visual identity + structural correctness.

Three high-value gotchas captured to memory:
1. `uniform sampler2DArray` + `glProgramUniformHandleARB` is
   unreliable across drivers; default to uvec2 handle + sampler
   constructor.
2. Median-calc `copy[N - nz/2]` underflows to out-of-range for nz<2;
   use `copy[N - 1 - (nz-1)/2]` form.
3. Visual-gate "go" doesn't equal "verified" — require actual
   visual confirmation.

Visual verification: confirmed at Holtburg town. 114/114 tests pass
in N.5+N.5b filter. Conformance sentinel max ‖Δ‖ = 0.015 mm across
1000 sample points / 10 representative landblocks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 13:05:12 +02:00
Erik
083c10c514 docs(N.5b T10): roadmap + ISSUES + CLAUDE.md + perf baseline updates
Document Phase N.5b shipping (terrain on the modern rendering path via
Path C — `TerrainModernRenderer` mirrors WB's `TerrainRenderManager`
pattern but consumes acdream's `LandblockMesh.Build` so retail's
`FSplitNESW` formula stays in lockstep with physics + visual mesh).

Changes:

- `docs/plans/2026-04-11-roadmap.md` — add N.5b row to the Shipped
  table; promote N.5b's "Phases ahead" entry to ✓ SHIPPED with the
  Path C resolution + perf reality check; refresh N.6 scope to note
  Terrain has joined the modern path (legacy `Texture2D` retirement
  scope narrows to Sky + Debug); update top-of-doc Status line.

- `docs/ISSUES.md` — close issue #51 (WB terrain-split formula
  divergence). Move from OPEN to "Recently closed" with the Path C
  resolution: never adopted WB's formula; modern dispatcher uses
  retail's via `LandblockMesh.Build`. References `da56063` (the
  black-terrain fix that landed within the N.5b ship chain).

- `CLAUDE.md` — add `TerrainModernRenderer.cs` to the WB integration
  cribs list with the GL_INVALID_OPERATION caveat (use uvec2 +
  `sampler2DArray(handle)` constructor, NOT direct
  `uniform sampler2DArray` + `glProgramUniformHandleARB`). Update
  the "Currently in flight" preamble: N.6 builds on N.5 + N.5b;
  add an N.5b shipped paragraph linking the perf baseline doc.

- `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` — new doc
  capturing the radius=5 Holtburg perf measurement (modern 6.4-7.0
  µs median vs legacy 1.5 µs — modern is ~4× SLOWER on CPU at
  radius=5). Documents the spec acceptance criterion #5 amendment,
  the architectural wins that DO hold (zero glBindTexture/frame,
  constant-cost dispatch as A.5 raises radius, per-LB frustum cull),
  and the three high-value gotchas surfaced during implementation.

User-memory updates (outside repo, not in this commit):
- `memory/project_phase_n5b_state.md` — full N.5b state file with
  the three gotchas captured.
- `memory/MEMORY.md` — index entry pointing at the state file.

Build: dotnet build green. No code changes in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 13:03:14 +02:00
Erik
7dfa2af6c0 phase(N.5b): retire legacy terrain renderers
Deletes:
- TerrainChunkRenderer.cs (454 lines, replaced by TerrainModernRenderer)
- TerrainRenderer.cs (247 lines, older sibling, no production users)
- terrain.vert / terrain.frag (replaced by terrain_modern.{vert,frag})

Removes the temporary Task 8 perf-benchmark toggle (ACDREAM_LEGACY_TERRAIN
env var, _useLegacyTerrain field, parallel _terrainLegacy renderer
instance, [TERRAIN-DIAG/modern|legacy] label suffix). The modern path
is now the only path. Mirror N.5's mandatory-modern amendment: missing
GL_ARB_bindless_texture throws NotSupportedException at startup
(already in place via the BindlessSupport.TryCreate gate).

Three load-bearing research comments preserved verbatim from terrain.vert
into terrain_modern.vert before deletion: the MIN_FACTOR = 0.0 N-dot-L
floor block (cross-ref Lambert brightness split), the aPacked3 bit
layout, the gl_VertexID corner-table 2026-04-21 ConstructPolygons fix.

Also retires the now-orphaned _shader field (legacy terrain pipeline
was its only user).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:59:05 +02:00
Erik
da56063be5 fix(N.5b): black terrain — switch to uvec2 handle + sampler constructor
Symptom: terrain renders pure black in modern path (legacy renderer
correct). Diagnostic at TerrainModernRenderer.Draw showed:
  glProgramUniformHandle(prog=4, loc=5, handle=0x100251xxx) → GL_INVALID_OPERATION (0x0502)
on both terrain and alpha sampler uniforms.

Root cause: the `uniform sampler2DArray` + glProgramUniformHandleARB
combination is rejected by the NVIDIA Windows driver in this configuration.
The handle is valid and resident; the uniform location is valid; the
program is valid; but the driver refuses to bind a 64-bit handle to a
sampler uniform via the program-uniform path.

Fix: switch to N.5's mesh_modern pattern — pass each 64-bit handle as a
`uniform uvec2` (low + high 32-bit halves) and construct the sampler at
the use site via the GLSL `sampler2DArray(handle)` constructor. This
form is what ARB_bindless_texture documents as universally supported and
is what N.5 already uses successfully.

Files:
- terrain_modern.frag: replace `uniform sampler2DArray uTerrain/uAlpha`
  with `uniform uvec2 uTerrainHandle/uAlphaHandle` + `#define`s
- TerrainModernRenderer.cs: cache uvec2 uniform locations; set via
  `glProgramUniform2(program, loc, low32, high32)` per frame
- BindlessSupport.cs: remove now-unused `SetSamplerHandleUniform`,
  leave a comment noting why the helper was retired
- GameWindow.cs: also strip the temporary [TERRAIN-DBG] cursor-wrap
  print added during the perf-baseline investigation

Build green; 114/114 tests in N.5+N.5b filter still pass; user-verified
terrain renders correctly in modern path post-fix. Captured fresh perf
baseline:
- Legacy:  cpu_us median  1.5  / p95  3.0  (1 chunk = 1 glDrawElements)
- Modern:  cpu_us median  6.4-7.0 / p95  9-14 (51 visible LBs, 1 MDI call)

Modern is ~4× slower on CPU at radius=5 because the chunked legacy path
already collapsed the scene to one draw call. The architectural wins
(zero glBindTexture/frame; constant-cost dispatch as A.5 raises radius)
will be documented in T10's perf baseline doc; the spec's
"≥10% lower CPU" acceptance criterion is invalid at radius=5 and needs
revision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:53:21 +02:00
Erik
55e516c538 fix(N.5b T8): TerrainDiagMedian/P95 IndexOutOfRangeException on first flush
First diag flush fires ~5s after process start (Environment.TickCount64
threshold), but at that point only 1 sample may have been recorded if
the user is mid-login. The original `copy[copy.Length - nz / 2]` form
underflowed to copy[copy.Length] when nz=1 (nz/2=0), throwing
IndexOutOfRangeException at GameWindow.cs:8799 on the first OnRender
after login.

Fix: use `copy.Length - 1 - (nz - 1) / 2` for median (always >= 0 for
nz >= 1, returns the single sample for nz=1) and clamp the percentile
offset via `(nz - 1) * 0.05` for the same reason.

Caught by user's perf-baseline launch with ACDREAM_LEGACY_TERRAIN=1
(the benchmark toggle from 336ad34). The bug exists in T8 itself
regardless of the toggle.

Build green; existing tests still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:40:22 +02:00
Erik
336ad34444 chore(N.5b): TEMPORARY perf benchmark toggle for legacy↔modern terrain
Adds an ACDREAM_LEGACY_TERRAIN=1 env var that routes Draw through the
legacy TerrainChunkRenderer instead of the new TerrainModernRenderer.
Both renderers are constructed and fed AddLandblock/RemoveLandblock so
they stay in sync; only one is drawn per frame. The [TERRAIN-DIAG]
log line is labeled /modern or /legacy so the user can tell which
numbers they're capturing.

Removed in Task 9 along with TerrainChunkRenderer.cs, terrain.vert,
and terrain.frag.

Usage:
  \$env:ACDREAM_LEGACY_TERRAIN = "1"   # legacy mode
  \$env:ACDREAM_LEGACY_TERRAIN = \$null # modern mode (default)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:36:13 +02:00
Erik
75913c1c97 phase(N.5b): wire TerrainModernRenderer into GameWindow
Swap TerrainChunkRenderer → TerrainModernRenderer (drop-in: same
AddLandblock/RemoveLandblock/Draw interface). Pass BindlessSupport
to TerrainAtlas.Build so GetBindlessHandles() is callable. Load the
new terrain_modern shader pair and pass to the renderer ctor. Add
[TERRAIN-DIAG] rollup mirroring the existing [WB-DIAG] pattern.

Bindless detection moved above terrain construction so atlas + ctor
can consume BindlessSupport (was previously detected after — order
required for N.5b).

Visual verification at four scenes (Holtburg flat + sloped, Foundry,
sloped landblock) is the next gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:21:32 +02:00
Erik
3418f65462 fix(N.5b T6): index-length validation + document VertsPerLandblock %6 invariant
Code review (Important #1): AddLandblock validated Vertices.Length but
not Indices.Length. The indices loop indexes meshData.Indices[0..383]
unconditionally — out-of-range input would throw IndexOutOfRangeException
instead of the clearer ArgumentException the vertex check raises. Today
LandblockMesh.Build always produces 384/384, so this is defensive
forward-compat for future mesh sources.

Code review (Important #2): The shader (terrain_modern.vert:gl_VertexID
% 6) only correctly picks the cell-corner index because we bake
`slot * VertsPerLandblock` into indices and 384 is a multiple of 6.
That invariant is now documented in a comment near the constant — anyone
changing it must audit the shader.

Build green: 0 errors / 0 warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:15:51 +02:00
Erik
0a77bd1fd7 phase(N.5b) Task 6: TerrainModernRenderer
The new terrain dispatcher. Single global VBO/EBO with a slot
allocator (one slot per landblock, 384 verts × 40 bytes per slot).
Per-frame: build DEIC array from visible slots, upload, dispatch
via glMultiDrawElementsIndirect. Atlas textures bound via bindless
handles set per-frame as sampler uniforms.

Total ~6-8 GL calls per frame for terrain regardless of visible
landblock count (vs today's per-LB binds at radius=2 → ~25 calls,
radius=5 → ~121 calls).

API mirrors TerrainChunkRenderer so GameWindow integration in T8 is
a drop-in field+ctor swap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:05:28 +02:00
Erik
4ed79207a6 fix(N.5b T7): tighten conformance sample upper bound to 191.975f
Code review identified a latent false-positive flake risk: physics
path clamps fx = localX/24 to (CellsPerSide - 0.001f) = 7.999, which
corresponds to localX <= 191.976. With samples up to 191.999f,
physics computes Z at the clamped position while the mesh sampler
uses the actual position — a difference of up to 23 mm at the upper
edge, which on a steep slope would falsely trip the 1 mm sentinel.

Tighten upper bound to 191.975f (strictly below the clamp boundary)
so both oracles compute Z at the same (cellX, tx). Also restored the
"worst-case from SplitFormulaDivergenceTest" inline comment for
landblock 0x4D96 per code review suggestion #3.

Test still passes: 10/10 landblocks, 1000 samples, max |delta|
= 0.0153 mm (previously 0.0305 mm — confirms the prior worst-case
was indeed at the boundary).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:59:01 +02:00
Erik
e54d5ca2cf phase(N.5b) Task 7: TerrainModernConformanceTests
Z-conformance sentinel for issue #51's bug class. Sweeps 10
representative landblocks x 100 sample points (uniform random in
local 0..192 with fixed seed 42). For each point: compute meshTriZ
via barycentric interpolation in the matching triangle of the
LandblockMesh.Build output; compute physicsZ via
TerrainSurface.SampleZFromHeightmap; assert |delta| < 0.001m.

Catches any silent formula or vertex-layout drift between the
visual and physics paths. Skips gracefully if ACDREAM_DAT_DIR
isn't set (CI without dat data).

Local run with dat data: 10/10 landblocks loaded, 1000 samples,
max |delta| = 0.0305 mm (worst case: Direlands 0xC040).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:49:15 +02:00
Erik
1ea00a075e phase(N.5b) Task 5: terrain_modern.frag
Fragment shader for the modern terrain dispatcher. Bit-identical math
to today's terrain.frag (per-cell maskBlend3 + Phase G fog + lightning
flash). Same #version 460 + GL_ARB_bindless_texture preamble change
as terrain_modern.vert. Sampling syntax unchanged — the bindless-ness
is invisible at the GLSL level.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:45:40 +02:00
Erik
3c108a0d68 phase(N.5b) Task 4: terrain_modern.vert
Vertex shader for the modern terrain dispatcher. Bit-identical math
to today's terrain.vert (Phase 3c per-cell mesh + Phase G AdjustPlanes
lighting). The only structural change is the version + bindless
extension preamble — sampler access stays a regular sampler2DArray
uniform; bindless-ness is invisible at the GLSL level.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:45:22 +02:00
Erik
ba852993e9 phase(N.5b) Task 2: TerrainSlotAllocator + tests
Pure-CPU slot allocator for the terrain modern dispatcher's global
VBO/EBO. FIFO free-list + monotonic counter, mirroring WB's
TerrainRenderManager pattern. Caller (TerrainModernRenderer) handles
GPU buffer growth when Allocate sets needsGrow=true.

8 unit tests cover: fresh-allocator returns slot 0, sequential
allocs, free+alloc reuse, FIFO ordering, needsGrow signaling on
capacity overflow, GrowTo, LoadedCount tracking, and double-free
detection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:44:51 +02:00
Erik
db0f010544 phase(N.5b) Task 1: TerrainAtlas bindless extension
Add optional BindlessSupport ctor parameter + GetBindlessHandles()
method that returns (terrainHandle, alphaHandle) ulongs with both
textures made resident. Two-phase Dispose mirroring TextureCache
(MakeNonResident before DeleteTexture per ARB_bindless_texture spec).

Existing callers pass `Build(gl, dats)` unchanged; bindless = null
default keeps them working until T6/T8 wires the renderer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:37:23 +02:00
Erik
79367d4c15 plan(N.5b): implementation plan for terrain on modern path
Expands spec section 10 into 10 TDD-style tasks with explicit
dependency arrows. Phase A (T1, T2, T4, T5, T7) parallelizable
across 5 subagents; Phase B (T6 dispatcher) serial; Phase C (T8
GameWindow integration) serial; user verification gate; Phase D
(T9 delete legacy + T10 docs/memory) parallelizable.

Each task includes exact file paths, complete code blocks, exact
test/build commands with expected output, and HEREDOC commit
messages. Self-review: no placeholders; type-consistent across
tasks (TerrainSlotAllocator API, GetBindlessHandles signature,
SetSamplerHandleUniform contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:32:19 +02:00
Erik
b35ddf3426 spec(N.5b): design for terrain on the modern rendering path
Brainstormed 2026-05-09. Lifts outdoor terrain rendering onto N.5's
modern primitives (bindless textures + glMultiDrawElementsIndirect)
preserving the visible terrain pixel-for-pixel and preserving
physics-vs-visual Z agreement (issue #51).

Key decisions:
- Path C: WB renderer pattern + acdream's existing LandblockMesh.Build
  (which uses retail's FSplitNESW formula, verified at retail addr
  00531d10). Path A killed by 49.98% measured divergence vs retail.
- Single global VBO/EBO + slot allocator (one slot per landblock),
  uint32 indices with baseVertex baked, mirror WB's pattern.
- Keep TerrainAtlas (palCode-based fragment blending), add bindless
  handles. No LandSurfaceManager adoption.
- Separate terrain_modern.vert/.frag (port of today's terrain.vert/.frag
  with bindless preamble; same blend math, same AdjustPlanes lighting).
- Pure-CPU Z-conformance sentinel: meshTriZ vs TerrainSurface within
  1mm across 10 representative landblocks x 100 sample points.
- Acceptance: build green, conformance test passes, ~6-8 GL calls/frame
  for terrain regardless of scene size, [TERRAIN-DIAG] cpu_ms at
  radius=5 >=10% lower than today's per-LB-binds path.

Files added: TerrainModernRenderer + TerrainSlotAllocator +
terrain_modern.vert/.frag + 2 test files.
Files deleted: TerrainChunkRenderer + TerrainRenderer +
terrain.vert/.frag.

Out of scope: EnvCells/dungeons, sky, particles, A.5 LOD,
LandSurfaceManager adoption, fork-patching WB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:23:09 +02:00
Erik
47f2cea1e8 test(N.5b): quantify WB vs retail terrain split formula divergence
Sweeps all (lbX, lbY, cellX, cellY) tuples for the full 255x255
landblock map (~4.16M cells) and reports both the raw enum-output
disagreement (50.02%) and the diagonal-actually-painted disagreement
(49.98%) between WB's CalculateSplitDirection and acdream's
TerrainBlending.CalculateSplitDirection (which retail uses per
CLandBlockStruct::ConstructPolygons at retail addr 00531d10).

The two formulas behave like independent random hashes. Adopting
WB's pipeline wholesale would mis-render ~half the diagonals on
every landblock (Holtburg 0xA9B0: 29/64 cells = 45.3% wrong). This
data is the foundation for N.5b's Path A vs B vs C decision (kills
Path A).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 08:22:50 +02:00
Erik
380922cdbe docs(N.5b): cold-start handoff for next session
Captures everything a fresh agent needs to pick up Phase N.5b (Terrain
on the Modern Rendering Path) without spelunking through the N.5
session history.

Front-loads the load-bearing constraint: issue #51 (WB's terrain split
formula diverges from retail's FSplitNESW). Lays out three viable
design paths (A: adopt WB's formula everywhere; B: keep retail's
formula and fork-patch WB; C: WB mesh layout but our formula). The
brainstorm needs to pick one, informed by quantified divergence rate
across representative landblocks.

Includes file-by-file inventory of acdream's terrain stack (1383 lines
across TerrainRenderer + TerrainChunkRenderer + TerrainAtlas + shaders)
vs WB's (1937 lines across TerrainRenderManager + TerrainGeometryGenerator
+ LandSurfaceManager). Eight brainstorm questions covering atlas model,
mesh ownership, index format, shader unification, streaming integration,
conformance test, and visual verification gate.

Mirrors the N.5 handoff structure that worked well last session:
TL;DR + where N.5 left things + what N.5b inherits + technical detail
+ files to read + brainstorm questions + acceptance criteria + first
30 minutes + things to NOT do.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 07:16:10 +02:00
Erik
a64cd11def docs(roadmap): add A.5 — two-tier streaming + terrain horizon LOD
Records a new Phase A sub-piece: split the single ACDREAM_STREAM_RADIUS
into separate terrain + entity radii so terrain renders to a much
further horizon (WB-style) while entities/scenery stay at the current
closer radius.

Motivated by perf at ACDREAM_STREAM_RADIUS=5 dropping from ~810 fps
to ~200-300 fps because everything stays full-detail. Both retail and
WorldBuilder render terrain way out and strip entities at distance.

Estimate: 3-5 days for the radius split + fog tuning; +1 week if
terrain LOD via mesh decimation is included. Not yet brainstormed.

N.8 (sky + particles via WB's SkyboxRenderManager + ParticleEmitterRenderer)
was already on the roadmap; user confirmed they want it tracked there.
No edit needed for N.8 — already at the right level of detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 23:45:05 +02:00
Erik
d73dcd56ba docs: defer per-instance highlight to open backlog (no scheduled phase)
Reframe the selection-blink follow-up so it doesn't suggest near-term
work. Was listed in N.5 ship record as "Phase B.4 follow-up adds the
field" — now phrased as open backlog with the hook reserved in
mesh_modern.vert's InstanceData comment for whoever eventually picks
it up.

The shader hook itself is unchanged — change is purely doc wording in
the plan SHIP record + CLAUDE.md WB integration cribs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:22:23 +02:00
Erik
27eaf4e0be Merge branch 'claude/priceless-feistel-c12935' — Phase N.5 SHIP
N.5: Modern Rendering Path. WbDrawDispatcher now uses bindless
textures + glMultiDrawElementsIndirect on top of N.4's grouped
pipeline. Three SSBO uploads + 2 indirect calls per frame, ~12-15
total GL calls for entity rendering regardless of scene complexity.

Measured 1.23 ms / frame median at Holtburg courtyard (1662 groups,
~810 fps). User-gated visual verification PASS at Holtburg.

Includes ship-amendment: legacy renderer path formally retired
(InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag
deleted). Bindless is now mandatory; missing extensions throw
NotSupportedException at startup with a clear error message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:13:20 +02:00
Erik
e0dbc9c66f phase(N.5): SHIP-amendment — escape hatch retired
Corrects the SHIP commit's acceptance gate verdict on the legacy
escape hatch. The original gate "[x] ACDREAM_USE_WB_FOUNDATION=0
still works" was inaccurate — Task 15's mesh_instanced deletion left
InstancedMeshRenderer orphaned + non-functional. Resolution: formal
retirement of the legacy path within N.5 (the prior commit).

Updated acceptance gate verdict:
- [N/A] ACDREAM_USE_WB_FOUNDATION=0 — escape hatch retired in N.5;
  modern path is now mandatory, bindless required at startup. Missing
  bindless throws NotSupportedException with a clear error message.

All other gates unchanged from the SHIP commit:
- [x] Visual identity to N.4 — Task 10 + Task 14 USER GATE PASS
- [x] CPU dispatcher time <= 70% of N.4 — measured 1.23 ms/frame at
      Holtburg courtyard, comfortably under threshold
- [x] drawsIssued <= 5 per pass (CPU GL calls) — 2 indirect calls/frame
- [x] All tests green — 71/71 in the relevant filter
- [ ] GPU rendering time +-10% of N.4 — DEFERRED (timer query
      double-buffering, N.6 follow-up)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:01:48 +02:00
Erik
dcae2b6b94 phase(N.5): retirement amendment — InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag deleted
Final cross-cutting review of N.5 found that Task 15's deletion of
mesh_instanced.vert/.frag left InstancedMeshRenderer orphaned —
ACDREAM_USE_WB_FOUNDATION=0 silently rendered terrain+sky only with
no entities. The SHIP commit's "[x] ACDREAM_USE_WB_FOUNDATION=0 still
works" claim was inaccurate.

Resolution: formal retirement of the legacy renderer path within N.5
instead of deferring to N.6.

Deleted:
- src/AcDream.App/Rendering/InstancedMeshRenderer.cs
- src/AcDream.App/Rendering/StaticMeshRenderer.cs
- src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs

GameWindow simplified — capability detection is unconditional, missing
bindless throws NotSupportedException with a clear message at startup.
WbDrawDispatcher + mesh_modern shader load are mandatory after init.
No escape hatch.

GpuWorldState simplified — WbFoundationFlag.IsEnabled guards on
AddLandblock/RemoveLandblock removed; adapter calls are unconditional
when the adapter is non-null.

PendingSpawnIntegrationTests updated — WbFoundationFlag.ForTestsOnly_ForceEnable
static ctor removed (flag is gone; adapter calls are unconditional).

The ApplyLoadedTerrain physics-data loop was also simplified: the
EnsureUploaded sub-loop that fed InstancedMeshRenderer is gone;
_pendingCellMeshes is now explicitly cleared to prevent unbounded
accumulation (the worker thread still populates it, but WB handles
EnvCell geometry through its own pipeline).

Spec §2 Decision 5 + §10 Out-of-Scope updated. Plan ship-amendment
section added. Roadmap updated (N.5 ships with retirement; N.6 scope
narrowed to perf-only). CLAUDE.md "WB integration cribs" updated.
Perf baseline doc updated. WbDrawDispatcher class summary docstring
corrected to describe the as-shipped SSBO + multi-draw-indirect path.
ISSUES.md #51 updated (terrain not in N.5 scope; deferred to N.7).

Bindless support is now a hard requirement. Modern desktop GPUs
universally expose GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters;
if a user hits the NotSupportedException, that's a real bug report
worth investigating, not a silent fallback.

Build: 0 errors, 0 warnings. Tests: 71/71 (Wb+MatrixComposition+TextureCacheBindless filter).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:01:36 +02:00
Erik
55ecec683f phase(N.5): SHIP — modern rendering path on N.4 dispatcher
Bindless textures + glMultiDrawElementsIndirect on top of N.4's grouped
pipeline. Per-frame entity rendering: 3 SSBO uploads (instance matrices
@ binding=0, batch data @ binding=1, indirect commands) + 2 indirect
calls (opaque + transparent). Total ~12-15 GL calls per frame for entity
rendering, regardless of scene complexity.

Acceptance gates (spec §8.3):
- [x] Visual identity to N.4 — Task 10 USER GATE PASS (Holtburg courtyard)
      + Task 14 USER GATE PASS (general roaming, no regressions seen)
- [x] CPU dispatcher time ≤ 70% of N.4 — measured 1.23 ms/frame median
      at Holtburg courtyard (1662 groups, ~810 fps); estimated N.4
      hot path ≥2.5 ms/frame; comfortably under threshold
- [x] drawsIssued ≤ 5 per pass (CPU GL calls) — exactly 2 indirect calls
      per frame regardless of scene size
- [x] All tests green — 71/71 in
      FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless
- [x] ACDREAM_USE_WB_FOUNDATION=0 still works — InstancedMeshRenderer
      escape hatch preserved (its own shader path, untouched)
- [ ] GPU rendering time within ±10% of N.4 — DEFERRED to N.6.
      GL_TIME_ELAPSED query polling never reports avail!=1 within the
      same frame; needs double-buffering. CPU is the load-bearing metric.

Plan amendments captured during execution:
- Task 2: parallel Texture2DArray upload path (replacing the original
  "switch globally" framing that would've broken 4 legacy consumers)
- Task 3+4: parallel bindless cache dictionaries (avoiding the GLSL
  type mismatch from sampling a Texture2D handle via sampler2DArray)
- Task 5: preserved mesh_instanced.frag's full SceneLighting UBO + 8
  lights + fog + lightning flash + per-channel clamp
- Task 9: BatchDataPublic Pack=8 (required for safe MemoryMarshal.Cast)

Plan archived at:
  docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Spec at:
  docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md
Perf baseline at:
  docs/plans/2026-05-08-phase-n5-perf-baseline.md
Memory at:
  ~/.claude/.../memory/project_phase_n5_state.md

Files changed: 6 added, 6 modified, 2 deleted. 19 tasks shipped across
~40 commits including amendments + fixups + reviews.

N.6 follow-ups: retire InstancedMeshRenderer entirely; GPU timer query
double-buffering; persistent-mapped buffers if profiling shows the
residual glBufferData hot spot; possible WB atlas adoption for memory
savings on shared content; possible GPU-side culling via compute pre-pass;
per-instance highlight (selection blink) for retail-faithful click feedback
(field reserved in mesh_modern.vert's InstanceData struct).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:14:50 +02:00
Erik
77e619d48a phase(N.5): roadmap — N.5 shipped, N.6 next
Moves N.5 from in-flight to Shipped (2026-05-08). N.6 (retire
InstancedMeshRenderer + perf polish) becomes the in-flight phase.
CLAUDE.md in-flight pointer updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:13:49 +02:00
Erik
38eb999f2c phase(N.5) Task 18: plan finalization — SHIP record appended
Records the as-shipped state: acceptance gate verdicts, plan amendments
captured during execution, code-review adjustments per task, out-of-scope
N.6 follow-ups, and a complete files-changed summary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:13:37 +02:00
Erik
e6378b90ed phase(N.5) Task 15: delete legacy mesh_instanced shader files
mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses
mesh_modern when WB foundation is on. Legacy escape hatch
(ACDREAM_USE_WB_FOUNDATION=0 or bindless missing) runs through
InstancedMeshRenderer which has its own shader path — untouched.

GameWindow's else-branch removed; if bindless is missing, _meshShader
stays unloaded, _wbDrawDispatcher stays null, and _staticMesh is not
constructed (its guard requires _meshShader non-null). All downstream
_staticMesh usages were already null-safe (null-conditional operators
or explicit null guards). Two null-forgiving suppressors added at the
WbDrawDispatcher + SkyRenderer construction sites where the compiler
couldn't prove non-null but the logic guarantees it (both require
_bindlessSupport non-null, which implies _meshShader was assigned;
_textureCache is assigned unconditionally).

InstancedMeshRenderer.cs: the one reference to mesh_instanced was
a code comment (location 3 NOT used by mesh_instanced.vert) — not
a file load. Escape hatch code path is preserved; the shader comment
is now stale but low priority.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:13:05 +02:00
Erik
39ccd29030 phase(N.5) Task 16: extend CLAUDE.md WB cribs with N.5 patterns
Adds four new bullets covering: the modern dispatch's three-SSBO +
multi-draw indirect layout; TextureCache.BindlessSupport contract +
parallel Texture2DArray upload path; two-pass alpha-test translucency
+ additive fallback plan; reserved per-instance highlight hook for
Phase B.4 follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:11:29 +02:00
Erik
2eeb6bd613 phase(N.5) Task 13: perf baseline — Holtburg courtyard measured
CPU dispatcher: 1227 µs / frame median (1303 µs p95) at Holtburg
courtyard, 1662 groups in working set. Inferred ~810 fps sustained.

CPU dispatcher acceptance gate (≤70% of N.4): PASS — N.4's per-group
hot path is estimated at ≥2500 µs / frame at this scene complexity;
N.5 is comfortably under half.

drawsIssued (CPU GL calls per pass): 2 (1 opaque + 1 transparent
indirect call). Down from N.4's ~hundreds per pass. PASS.

GPU timing: unmeasured. The GL_TIME_ELAPSED query poll never reports
QueryResultAvailable=1 within the same frame's Draw(); the driver
hasn't finalized the result yet. Fix is double-buffering (queryA
on frame N, read on N+2). Deferred to N.6 perf polish — doesn't block
N.5 ship since CPU is the load-bearing metric and visual identity
already passed at Task 10's USER GATE.

Direct N.4 baseline NOT measured. Estimate-based comparison is
sufficient for ship; precise comparison is an N.6 follow-up.

Baseline doc at docs/plans/2026-05-08-phase-n5-perf-baseline.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:08:21 +02:00
Erik
d114dca1e8 phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG]
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent) wrapping each glMultiDrawElementsIndirect,
polled non-blocking via QueryResultAvailable on the next frame.

Sample window is 256 frames per signal; median + p95 reported.
Numbers populate the SHIP commit's perf table at Task 19.

Silk.NET naming note: GL_TIME_ELAPSED queries use QueryTarget.TimeElapsed
(confirmed present in Silk.NET.OpenGL 2.23.0 DLL). The 64-bit result is
read via GetQueryObject(..., out ulong) which dispatches to
glGetQueryObjectui64v; the int overload (glGetQueryObjectiv) is used for
the ResultAvailable poll, matching WorldBuilder's VisibilityManager pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:57:26 +02:00
Erik
cfe1ca3151 phase(N.5) Task 11: translucency partition contract test
Locks in Decision 2 (Opaque + ClipMap → opaque indirect; AlphaBlend +
Additive + InvAlpha → transparent indirect). Catches future refactors
that drift the partition — silent visual regression otherwise (groups
rendered in the wrong pass with the wrong blend state).

Adds public static IsOpaquePublic shim on WbDrawDispatcher; the
underlying IsOpaque stays private.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:53:36 +02:00
Erik
f533414edf phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three SSBOs:
- _instanceSsbo @ binding=0 (mat4 per instance, indexed by gl_BaseInstanceARB + gl_InstanceID)
- _batchSsbo @ binding=1 (BatchData per group, indexed by gl_DrawIDARB)
- _indirectBuffer (DrawElementsIndirectCommand[] — opaque first, transparent second)

GameWindow swaps the shader load to mesh_modern when _bindlessSupport
is non-null. Capability detection + shader load now run in the right
order (capability before TextureCache + before Shader).

Deletes the obsolete DrawGroup stub, EnsureInstanceAttribs, _instanceBuffer,
_patchedVaos. ClassifyBatches + ResolveTexture already migrated in
Task 8 to use ulong bindless handles.

BuildIndirectArrays (Task 9) wired in: _opaqueDraws + _translucentDraws
are flattened into IndirectGroupInput[], laid out via the helper into
contiguous indirect commands + parallel BatchData[]. opaqueByteOffset=0,
transparentByteOffset = opaqueCount × DrawCommandStride.

Visual verification (USER GATE) PASS: Holtburg courtyard renders
identical to N.4 — terrain, scenery, characters, NPCs all visible
without artifacts. [N.5] modern path capabilities present + mesh_modern
shader loaded log lines confirm the boot path. [WB-DIAG] hot-path
counters show healthy entity/draw activity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:51:49 +02:00
Erik
b163c53622 phase(N.5) Task 9 fixup: layout assertion + DrawCommandStride const
Code quality review caught:
- sizeofDEIC was a local; promoted to public const DrawCommandStride
  so tests can reference it symbolically.
- BatchDataPublic layout invariant (size + field offsets) wasn't
  asserted in tests. Added BatchDataPublic_LayoutMatchesPrivateBatchData
  + DrawCommandStride_MatchesStructSize tests to gate Task 10's
  MemoryMarshal.Cast<BatchData, BatchDataPublic> safety.
- Plan doc updated: BatchDataPublic spec was Pack=4 (wrong — must
  match private BatchData's Pack=8 for the cast to work). Implementation
  was already correct; plan now matches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:42:49 +02:00
Erik
9a7a250b62 phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.

Tests cover: spec §5 walk-through layout; empty group list edge case;
ClipMap classification (treated as opaque, not transparent).

Static + public so tests can exercise without a GL context. Task 10
wires it into the rewritten Draw() method.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:38:22 +02:00
Erik
424d7b9015 phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer
Replaces uint TextureHandle (32-bit GL name) with ulong
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
return type. Adds TextureLayer (always 0 for per-instance composites,
becomes meaningful when WB atlas is adopted in N.6).

ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants —
these return Texture2DArray-backed bindless handles (Task 3 work).

DrawGroup body throws NotImplementedException — Task 10 rewrites the
whole Draw() method to use glMultiDrawElementsIndirect, which makes
DrawGroup obsolete. CPU-only tests don't invoke DrawGroup so the build
+ test gates stay green; visual launch fails until Task 10 (intentional).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:32:38 +02:00
Erik
1b6995d2df phase(N.5) Task 7 fixup: BatchData Pack=8 for ulong alignment
Code quality review caught that BatchData uses Pack=4 but contains a
ulong field. With the current field order (TextureHandle first), offset
0 is always 8-byte aligned so std430 works. But adding a 4-byte field
before TextureHandle without bumping Pack would silently misalign the
GPU struct. Pack=8 makes the alignment requirement explicit and adds
a comment documenting expected std430 offsets.

No runtime change — current offsets (0/8/12) are identical under both
Pack values for this field order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:29:58 +02:00