Adds AabbMin/AabbMax (per-entity world-space bounding box) and AabbDirty
flag to WorldEntity. RefreshAabb() recomputes the box from Position ±5 m
(DefaultAabbRadius). SetPosition() writes Position and marks the cache
dirty so the dispatcher calls RefreshAabb on first read rather than
carrying stale bounds.
AabbDirty defaults to true on construction — freshly-built entities have
zero AabbMin/AabbMax until RefreshAabb is called. Two new conformance tests
verify the ±5 m geometry and the dirty/clean state machine.
Per Phase A.5 spec §4.6 Change #2.
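
A minimal sketch of the dirty-flag caching described above, written in
Python for brevity (the project itself is C#). Field names mirror the
message; the tuple-based vector type and the get_aabb accessor are
assumptions made for illustration, not the actual WorldEntity API.

```python
from dataclasses import dataclass

DEFAULT_AABB_RADIUS = 5.0  # metres, the +/-5 m half-extent from the message

Vec3 = tuple[float, float, float]


@dataclass
class WorldEntity:
    position: Vec3 = (0.0, 0.0, 0.0)
    aabb_min: Vec3 = (0.0, 0.0, 0.0)
    aabb_max: Vec3 = (0.0, 0.0, 0.0)
    aabb_dirty: bool = True  # bounds stay zero until the first refresh

    def set_position(self, position: Vec3) -> None:
        """Write the position and invalidate the cached box."""
        self.position = position
        self.aabb_dirty = True

    def refresh_aabb(self) -> None:
        """Recompute the box as position +/- DEFAULT_AABB_RADIUS."""
        r = DEFAULT_AABB_RADIUS
        x, y, z = self.position
        self.aabb_min = (x - r, y - r, z - r)
        self.aabb_max = (x + r, y + r, z + r)
        self.aabb_dirty = False

    def get_aabb(self) -> tuple[Vec3, Vec3]:
        """Hypothetical read path: refresh lazily on first read."""
        if self.aabb_dirty:
            self.refresh_aabb()
        return self.aabb_min, self.aabb_max
```

In this sketch the reader pays for the recompute, so repeated
set_position calls between reads cost one refresh rather than several.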
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the Loaded result record with a LandblockStreamTier discriminator
and a LandblockMeshData payload (default! stub — T13 wires the real
off-thread mesh build). Adds the Promoted variant for Far→Near upgrades
that only need the entity layer, not a mesh rebuild.
LandblockStreamer.HandleJob passes Tier.Near + default! MeshData at the
existing synchronous load site; StreamingControllerTests updated to
match the new positional signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review on commits 7bcabab/fb6b61e/326b698 flagged 2 Important +
4 Minor issues. Apply all fixes:
Important:
- Two-tier RecenterTo + MarkResidentFromBootstrap now throw
InvalidOperationException on misuse — calling RecenterTo before the
bootstrap silently emitted the entire window as fresh loads (no
demotes/unloads since _tierResidence was empty), a correctness hazard
that produced no exception. Calling MarkResidentFromBootstrap twice
silently dropped accumulated tier state. Both now crash loudly via
a _bootstrapped flag.
- Dropped TierResidence.None from the enum — never assigned, never
checked; absence from the dictionary already encodes "not resident."
Minor:
- Renamed test: RecenterTo_FirstTick_* → ComputeFirstTickDiff_FirstTick_*
(the test calls ComputeFirstTickDiff, not RecenterTo).
- Strengthened RecenterTo_PlayerWalks_NullToFar_* with assertions for
ToPromote.Count==3 (the x=102 column promoting Far→Near) and
ToUnload.Empty (everything within hysteresis).
- Replaced System.Math.Abs with Math.Abs in new code to match the
file's existing `using System;` convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 5 tests to StreamingRegionTwoTierTests covering all tier-transition paths:
- FarToNear promote (walk 2 east from initial center)
- NullToNear teleport (loads 9 near + 40 far for a fully fresh region)
- NearToFar demote only after NearRadius+2 hysteresis threshold
- FarToNull unload only after FarRadius+2 hysteresis threshold
- oscillation no-thrash: bouncing 1 LB across a near boundary fires 0 demotes
and at most 5 promotes total (one initial settle of the x=100 near-column)
Oscillation test fix: initialise the region at the oscillation midpoint
(103,100) rather than at a distant starting center (100,100) so the
initial move into the oscillation range doesn't itself trigger legitimate
demotes, isolating the no-thrash invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TierResidence enum (None/Far/Near), _tierResidence dictionary seeded
by MarkResidentFromBootstrap, and the canonical two-tier RecenterTo overload
returning TwoTierDiff. Pass 1 walks the new far window and emits ToLoadFar /
ToLoadNear / ToPromote; Pass 2 walks prior residents and emits ToDemote /
ToUnload using Chebyshev hysteresis thresholds (NearRadius+2 / FarRadius+2).
EncodeLandblockIdForTest exposes the encoding rule to test assemblies.
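
The two passes can be sketched roughly as follows. This is a Python
illustration under stated assumptions, not the C# implementation: the
residence map is a plain dict keyed by (x, y), the diff is a dict of
five lists, and the recenter function name is hypothetical. It also
omits any bootstrap guard, so an empty map simply loads the whole window.

```python
NEAR, FAR = "near", "far"
HYSTERESIS = 2  # the "+2" in NearRadius+2 / FarRadius+2


def chebyshev(ax, ay, bx, by):
    return max(abs(ax - bx), abs(ay - by))


def recenter(residence, cx, cy, near_radius, far_radius):
    """Compute the five-list diff and update `residence` in place."""
    diff = {"load_far": [], "load_near": [], "promote": [],
            "demote": [], "unload": []}

    # Pass 1: walk the new far window; emit loads and Far->Near promotes.
    for x in range(cx - far_radius, cx + far_radius + 1):
        for y in range(cy - far_radius, cy + far_radius + 1):
            want_near = chebyshev(x, y, cx, cy) <= near_radius
            have = residence.get((x, y))
            if have is None:
                diff["load_near" if want_near else "load_far"].append((x, y))
                residence[(x, y)] = NEAR if want_near else FAR
            elif have == FAR and want_near:
                diff["promote"].append((x, y))
                residence[(x, y)] = NEAR

    # Pass 2: walk residents; demote or unload only past the hysteresis
    # thresholds. Cells added in pass 1 sit inside the window, so
    # visiting them again here is harmless.
    for (x, y), tier in list(residence.items()):
        d = chebyshev(x, y, cx, cy)
        if d > far_radius + HYSTERESIS:
            diff["unload"].append((x, y))
            del residence[(x, y)]
        elif tier == NEAR and d > near_radius + HYSTERESIS:
            diff["demote"].append((x, y))
            residence[(x, y)] = FAR
    return diff
```

With near radius 1 and far radius 3, stepping one landblock east from
(100,100) promotes only the x=102 column and unloads nothing, because
every prior resident is still inside its hysteresis band.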
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the first-tick bootstrap diff: ToLoadNear for the (2*near+1)^2 inner
window, ToLoadFar for the outer annulus up to FarRadius. Uses Chebyshev
distance, matching existing Recenter convention.
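
As a rough illustration of the first-tick partition (Python, not the
project's C#; the function name and list-of-tuples return shape are
assumptions), the window splits by Chebyshev distance like this:

```python
def first_tick_diff(cx, cy, near_radius, far_radius):
    """Split the initial window into near loads and far (annulus) loads."""
    load_near, load_far = [], []
    for x in range(cx - far_radius, cx + far_radius + 1):
        for y in range(cy - far_radius, cy + far_radius + 1):
            # Chebyshev distance: square rings around the center.
            d = max(abs(x - cx), abs(y - cy))
            (load_near if d <= near_radius else load_far).append((x, y))
    return load_near, load_far
```

For near=1 and far=3 this yields (2*1+1)^2 = 9 near cells and
(2*3+1)^2 - 9 = 40 far cells.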
Also renames the single-tier RecenterTo → RecenterToSingleTier to free
the canonical name for the upcoming two-tier overload (T5). Updates
StreamingRegionTests and StreamingController to call the renamed method.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review on commit 7fd9c82 flagged that the test asserted NearRadius,
FarRadius, CenterX, CenterY but not the load-bearing alias
Radius == FarRadius. That alias is what makes the existing hysteresis
math (Radius+2 unload threshold) correctly target the far-tier boundary.
Future typos would silently break far-tier hysteresis.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add NearRadius/FarRadius properties and a four-arg constructor
(centerX, centerY, nearRadius, farRadius). Radius is set to farRadius
so existing hysteresis math (unload threshold = Radius+2) uses the
outer ring as the bookkeeping boundary. Old three-arg constructor
becomes a thin wrapper: this(cx, cy, radius, radius) — no behaviour
change, 25 pre-existing streaming tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review on commit 90a2027 flagged that HandleJob silently ignores
load.Kind. Add a TODO(A.5 T11/T16) comment at the case arm so the
unused field reads as a planned stub, not a bug.
No semantic change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TwoTierDiff — the five-list output of StreamingRegion.RecenterTo
(ToLoadFar/Near, ToPromote, ToDemote, ToUnload) per spec §4.2. Used by
T3–T6 (StreamingRegion) and T13 (StreamingController).
Extends LandblockStreamJob.Load with a LandblockStreamJobKind parameter
so the streaming worker can route far vs near vs promote jobs differently
(spec §4.3). Patches the one call site in LandblockStreamer.EnqueueLoad
with LoadNear as a placeholder that preserves today's full-load semantics
until T11 activates the worker thread and T16 routes by tier.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase N.5b: terrain on the modern rendering path. TerrainModernRenderer
replaces TerrainChunkRenderer; bindless atlas via uvec2 handle + GLSL
sampler-from-handle constructor; constant-cost dispatch (~6µs/frame)
regardless of radius. Closes issue #51. 114 tests pass; conformance
sentinel max |delta| = 0.015 mm. Honest perf baseline doc captures
that modern is ~4x slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw call; architectural
wins manifest at higher radius (A.5).
Three high-value gotchas captured to memory:
1. uniform sampler2DArray + glProgramUniformHandleARB unreliable
across drivers; default to uvec2 handle + sampler constructor.
2. Median calc copy[N - nz/2] indexes one past the end for nz<2.
3. Visual gate 'go' != 'verified'.
Plus: A.5 cold-start handoff at docs/research/2026-05-10-phase-a5-handoff.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records what N.5b shipped, where the actual FPS bottleneck lives
(WbDrawDispatcher entity cull at ~4.3ms/frame, 86% of frame budget;
terrain dispatcher is now <1% of frame), and what A.5 has to do to
make the world look big without falling off a perf cliff.
Three concrete A.5 deliverables:
1. Two-tier streaming (near = full, far = terrain-only)
2. Per-LB entity bucketing in WbDrawDispatcher
3. Off-thread LandblockMesh.Build to avoid streaming hitches at higher
radius
Eight brainstorm questions for the next session, plus acceptance
criteria, files-to-read list, and explicit "don't do" warnings (don't
raise STREAM_RADIUS without tiering in place; don't put scenery in
far tier without an impostor pipeline; don't break the N.5b conformance
sentinel; etc.).
User's stated goal verbatim: "great smooth HIGH fps visuals. Should
look great. As long as it scales and we get very high FPS." This
reframes priorities away from radius=5 micro-optimization toward
visual scale.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TerrainModernRenderer replaces TerrainChunkRenderer. Single global
VBO/EBO + slot allocator + glMultiDrawElementsIndirect. Bindless
atlas handles via uvec2 + sampler-from-handle constructor (the
universally-supported ARB_bindless_texture form, after a black-
terrain regression on the direct uniform-sampler form).
Path C: WB renderer pattern + acdream's LandblockMesh.Build for
retail's FSplitNESW formula compliance. Closes issue #51.
Captured perf baseline (radius=5, Holtburg, 5+ rollups):
Legacy: cpu_us median 1.5 / p95 3.0 (1 chunk = 1 glDrawElements)
Modern: cpu_us median 6.4-7.0 / p95 9-14 (51 visible LBs, 1 MDI)
Modern is ~4× slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw. Architectural wins
(zero glBindTexture/frame; constant-cost dispatch as A.5 raises
radius) manifest at higher scene complexity. Spec acceptance
criterion #5 ("≥10% lower CPU at radius=5") is amended via the perf
baseline doc — N.5b ships on visual identity + structural correctness.
Three high-value gotchas captured to memory:
1. `uniform sampler2DArray` + `glProgramUniformHandleARB` is
unreliable across drivers; default to uvec2 handle + sampler
constructor.
2. Median-calc `copy[N - nz/2]` indexes one past the end for nz<2;
use the `copy[N - 1 - (nz-1)/2]` form.
3. Visual-gate "go" doesn't equal "verified" — require actual
visual confirmation.
Visual verification: confirmed at Holtburg town. 114/114 tests pass
in N.5+N.5b filter. Conformance sentinel max ‖Δ‖ = 0.015 mm across
1000 sample points / 10 representative landblocks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document Phase N.5b shipping (terrain on the modern rendering path via
Path C — `TerrainModernRenderer` mirrors WB's `TerrainRenderManager`
pattern but consumes acdream's `LandblockMesh.Build` so retail's
`FSplitNESW` formula stays in lockstep with physics + visual mesh).
Changes:
- `docs/plans/2026-04-11-roadmap.md` — add N.5b row to the Shipped
table; promote N.5b's "Phases ahead" entry to ✓ SHIPPED with the
Path C resolution + perf reality check; refresh N.6 scope to note
Terrain has joined the modern path (legacy `Texture2D` retirement
scope narrows to Sky + Debug); update top-of-doc Status line.
- `docs/ISSUES.md` — close issue #51 (WB terrain-split formula
divergence). Move from OPEN to "Recently closed" with the Path C
resolution: never adopted WB's formula; modern dispatcher uses
retail's via `LandblockMesh.Build`. References `da56063` (the
black-terrain fix that landed within the N.5b ship chain).
- `CLAUDE.md` — add `TerrainModernRenderer.cs` to the WB integration
cribs list with the GL_INVALID_OPERATION caveat (use uvec2 +
`sampler2DArray(handle)` constructor, NOT direct
`uniform sampler2DArray` + `glProgramUniformHandleARB`). Update
the "Currently in flight" preamble: N.6 builds on N.5 + N.5b;
add an N.5b shipped paragraph linking the perf baseline doc.
- `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` — new doc
capturing the radius=5 Holtburg perf measurement (modern 6.4-7.0
µs median vs legacy 1.5 µs — modern is ~4× SLOWER on CPU at
radius=5). Documents the spec acceptance criterion #5 amendment,
the architectural wins that DO hold (zero glBindTexture/frame,
constant-cost dispatch as A.5 raises radius, per-LB frustum cull),
and the three high-value gotchas surfaced during implementation.
User-memory updates (outside repo, not in this commit):
- `memory/project_phase_n5b_state.md` — full N.5b state file with
the three gotchas captured.
- `memory/MEMORY.md` — index entry pointing at the state file.
Build: dotnet build green. No code changes in this commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deletes:
- TerrainChunkRenderer.cs (454 lines, replaced by TerrainModernRenderer)
- TerrainRenderer.cs (247 lines, older sibling, no production users)
- terrain.vert / terrain.frag (replaced by terrain_modern.{vert,frag})
Removes the temporary Task 8 perf-benchmark toggle (ACDREAM_LEGACY_TERRAIN
env var, _useLegacyTerrain field, parallel _terrainLegacy renderer
instance, [TERRAIN-DIAG/modern|legacy] label suffix). The modern path
is now the only path. Mirrors N.5's mandatory-modern amendment: missing
GL_ARB_bindless_texture throws NotSupportedException at startup
(already in place via the BindlessSupport.TryCreate gate).
Three load-bearing research comments preserved verbatim from terrain.vert
into terrain_modern.vert before deletion: the MIN_FACTOR = 0.0 N-dot-L
floor block (cross-ref Lambert brightness split), the aPacked3 bit
layout, the gl_VertexID corner-table 2026-04-21 ConstructPolygons fix.
Also retires the now-orphaned _shader field (legacy terrain pipeline
was its only user).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symptom: terrain renders pure black in modern path (legacy renderer
correct). Diagnostic at TerrainModernRenderer.Draw showed:
glProgramUniformHandle(prog=4, loc=5, handle=0x100251xxx) → GL_INVALID_OPERATION (0x0502)
on both terrain and alpha sampler uniforms.
Root cause: the `uniform sampler2DArray` + glProgramUniformHandleARB
combination is rejected by the NVIDIA Windows driver in this configuration.
The handle is valid and resident; the uniform location is valid; the
program is valid; but the driver refuses to bind a 64-bit handle to a
sampler uniform via the program-uniform path.
Fix: switch to N.5's mesh_modern pattern — pass each 64-bit handle as a
`uniform uvec2` (low + high 32-bit halves) and construct the sampler at
the use site via the GLSL `sampler2DArray(handle)` constructor. This
form is what ARB_bindless_texture documents as universally supported and
is what N.5 already uses successfully.
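
For illustration only, the low/high split of a 64-bit handle looks like
this in Python (the renderer does the equivalent in C#; the helper names
and the example handle value are hypothetical):

```python
MASK32 = 0xFFFFFFFF


def split_handle(handle: int) -> tuple[int, int]:
    """Return (low32, high32) for a 64-bit bindless handle."""
    if not 0 <= handle < (1 << 64):
        raise ValueError("handle must fit in 64 bits")
    return handle & MASK32, (handle >> 32) & MASK32


def join_handle(low: int, high: int) -> int:
    """Inverse of split_handle; mirrors what the shader-side uvec2 carries."""
    return (high << 32) | low


low, high = split_handle(0x0000000123456789)  # made-up handle value
```

The low half goes in the uvec2's x component and the high half in y,
which is the order the sampler constructor expects.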
Files:
- terrain_modern.frag: replace `uniform sampler2DArray uTerrain/uAlpha`
with `uniform uvec2 uTerrainHandle/uAlphaHandle` + `#define`s
- TerrainModernRenderer.cs: cache uvec2 uniform locations; set via
`glProgramUniform2(program, loc, low32, high32)` per frame
- BindlessSupport.cs: remove now-unused `SetSamplerHandleUniform`,
leave a comment noting why the helper was retired
- GameWindow.cs: also strip the temporary [TERRAIN-DBG] cursor-wrap
print added during the perf-baseline investigation
Build green; 114/114 tests in N.5+N.5b filter still pass; user-verified
terrain renders correctly in modern path post-fix. Captured fresh perf
baseline:
- Legacy: cpu_us median 1.5 / p95 3.0 (1 chunk = 1 glDrawElements)
- Modern: cpu_us median 6.4-7.0 / p95 9-14 (51 visible LBs, 1 MDI call)
Modern is ~4× slower on CPU at radius=5 because the chunked legacy path
already collapsed the scene to one draw call. The architectural wins
(zero glBindTexture/frame; constant-cost dispatch as A.5 raises radius)
will be documented in T10's perf baseline doc; the spec's
"≥10% lower CPU" acceptance criterion is invalid at radius=5 and needs
revision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First diag flush fires ~5s after process start (Environment.TickCount64
threshold), but at that point only 1 sample may have been recorded if
the user is mid-login. The original `copy[copy.Length - nz / 2]` form
resolved to copy[copy.Length], one past the end, when nz=1 (integer
nz/2=0), throwing IndexOutOfRangeException at GameWindow.cs:8799 on the
first OnRender after login.
Fix: use `copy.Length - 1 - (nz - 1) / 2` for median (always >= 0 for
nz >= 1, returns the single sample for nz=1) and clamp the percentile
offset via `(nz - 1) * 0.05` for the same reason.
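
The index arithmetic can be checked with a small Python sketch. It
assumes the sorted copy has length n with the nz recorded samples in the
last nz slots, and that the percentile index is formed as
n - 1 - floor((nz - 1) * 0.05); both are inferred from the message
rather than taken from the code.

```python
def median_index_old(n: int, nz: int) -> int:
    """Original form: equals n (one past the end) when nz == 1."""
    return n - nz // 2


def median_index_fixed(n: int, nz: int) -> int:
    """Fixed form: always within [n - nz, n - 1] for nz >= 1."""
    return n - 1 - (nz - 1) // 2


def p95_index(n: int, nz: int) -> int:
    """Assumed percentile form with the (nz - 1) * 0.05 clamp."""
    return n - 1 - int((nz - 1) * 0.05)
```

With n=64 and nz=1 the old form yields index 64, which is out of range,
while both fixed forms yield 63, the single recorded sample.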
Caught by user's perf-baseline launch with ACDREAM_LEGACY_TERRAIN=1
(the benchmark toggle from 336ad34). The bug exists in T8 itself
regardless of the toggle.
Build green; existing tests still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an ACDREAM_LEGACY_TERRAIN=1 env var that routes Draw through the
legacy TerrainChunkRenderer instead of the new TerrainModernRenderer.
Both renderers are constructed and fed AddLandblock/RemoveLandblock so
they stay in sync; only one is drawn per frame. The [TERRAIN-DIAG]
log line is labeled /modern or /legacy so the user can tell which
numbers they're capturing.
Removed in Task 9 along with TerrainChunkRenderer.cs, terrain.vert,
and terrain.frag.
Usage:
$env:ACDREAM_LEGACY_TERRAIN = "1" # legacy mode
$env:ACDREAM_LEGACY_TERRAIN = $null # modern mode (default)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Swap TerrainChunkRenderer → TerrainModernRenderer (drop-in: same
AddLandblock/RemoveLandblock/Draw interface). Pass BindlessSupport
to TerrainAtlas.Build so GetBindlessHandles() is callable. Load the
new terrain_modern shader pair and pass to the renderer ctor. Add
[TERRAIN-DIAG] rollup mirroring the existing [WB-DIAG] pattern.
Bindless detection moved above terrain construction so atlas + ctor
can consume BindlessSupport (was previously detected after — order
required for N.5b).
Visual verification at four scenes (Holtburg flat + sloped, Foundry,
sloped landblock) is the next gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review (Important #1): AddLandblock validated Vertices.Length but
not Indices.Length. The indices loop indexes meshData.Indices[0..383]
unconditionally — out-of-range input would throw IndexOutOfRangeException
instead of the clearer ArgumentException the vertex check raises. Today
LandblockMesh.Build always produces 384/384, so this is defensive
forward-compat for future mesh sources.
Code review (Important #2): The shader (terrain_modern.vert:gl_VertexID
% 6) only correctly picks the cell-corner index because we bake
`slot * VertsPerLandblock` into indices and 384 is a multiple of 6.
That invariant is now documented in a comment near the constant — anyone
changing it must audit the shader.
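
The invariant is easy to check numerically. The sketch below is Python
rather than GLSL and assumes, as the message describes, that the baked
index value (slot * VertsPerLandblock + local vertex) is what the shader
sees as gl_VertexID; the function name is hypothetical.

```python
VERTS_PER_LANDBLOCK = 384


def corner_index(slot, local_vertex, verts_per_landblock=VERTS_PER_LANDBLOCK):
    """Emulate the shader's gl_VertexID % 6 for a baked index."""
    gl_vertex_id = slot * verts_per_landblock + local_vertex
    return gl_vertex_id % 6


# Holds for every slot because 384 is a multiple of 6.
ok = all(corner_index(s, v) == v % 6 for s in range(16) for v in range(384))
```

Changing the constant to a value that is not a multiple of 6 (380, say)
shifts the corner index in every slot after the first.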
Build green: 0 errors / 0 warnings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new terrain dispatcher. Single global VBO/EBO with a slot
allocator (one slot per landblock, 384 verts × 40 bytes per slot).
Per-frame: build DEIC array from visible slots, upload, dispatch
via glMultiDrawElementsIndirect. Atlas textures bound via bindless
handles set per-frame as sampler uniforms.
Total ~6-8 GL calls per frame for terrain regardless of visible
landblock count (vs today's per-LB binds at radius=2 → ~25 calls,
radius=5 → ~121 calls).
API mirrors TerrainChunkRenderer so GameWindow integration in T8 is
a drop-in field+ctor swap.
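
A sketch of the per-frame command build, in Python for illustration.
The five-field layout is OpenGL's DrawElementsIndirectCommand; using
baseVertex = 0 with firstIndex = slot * 384 assumes the slot offset is
already baked into the index values, and the function name is
hypothetical.

```python
import struct

INDICES_PER_LANDBLOCK = 384

# count, instanceCount, firstIndex, baseVertex (signed), baseInstance
DEIC = struct.Struct("<IIIiI")


def build_indirect_buffer(visible_slots):
    """Pack one indirect command per visible landblock slot."""
    out = bytearray()
    for slot in visible_slots:
        out += DEIC.pack(
            INDICES_PER_LANDBLOCK,          # indices drawn for this landblock
            1,                              # one instance
            slot * INDICES_PER_LANDBLOCK,   # where the slot's indices start
            0,                              # offset assumed baked into indices
            0,                              # baseInstance unused here
        )
    return bytes(out)
```

The resulting buffer is 20 bytes per visible landblock and is consumed
by a single multi-draw call, which is why the GL call count stays flat
as the visible set grows.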
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review identified a latent false-positive flake risk: physics
path clamps fx = localX/24 to (CellsPerSide - 0.001f) = 7.999, which
corresponds to localX <= 191.976. With samples up to 191.999f,
physics computes Z at the clamped position while the mesh sampler
uses the actual position — a difference of up to 23 mm at the upper
edge, which on a steep slope would falsely trip the 1 mm sentinel.
Tighten upper bound to 191.975f (strictly below the clamp boundary)
so both oracles compute Z at the same (cellX, tx). Also restored the
"worst-case from SplitFormulaDivergenceTest" inline comment for
landblock 0x4D96 per code review suggestion #3.
Test still passes: 10/10 landblocks, 1000 samples, max |delta|
= 0.0153 mm (previously 0.0305 mm — confirms the prior worst-case
was indeed at the boundary).
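
The boundary arithmetic, worked through in Python. The sketch uses
doubles where the C# path uses floats, and the function names are
hypothetical; only the 24 m cell size, the 8 cells per side, and the
0.001 clamp margin come from the message.

```python
CELL_SIZE = 24.0
CELLS_PER_SIDE = 8


def physics_fx(local_x: float) -> float:
    """Cell-space coordinate as the physics path clamps it."""
    return min(local_x / CELL_SIZE, CELLS_PER_SIDE - 0.001)


def physics_sample_x(local_x: float) -> float:
    """Position at which physics effectively evaluates Z."""
    return physics_fx(local_x) * CELL_SIZE


clamp_limit = (CELLS_PER_SIDE - 0.001) * CELL_SIZE   # about 191.976
gap = 191.999 - physics_sample_x(191.999)            # about 0.023 m
```

A sample at 191.975 sits below the clamp limit, so both oracles evaluate
Z at the same position and the 23 mm horizontal gap disappears.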
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Z-conformance sentinel for issue #51's bug class. Sweeps 10
representative landblocks x 100 sample points (uniform random in
local 0..192 with fixed seed 42). For each point: compute meshTriZ
via barycentric interpolation in the matching triangle of the
LandblockMesh.Build output; compute physicsZ via
TerrainSurface.SampleZFromHeightmap; assert |delta| < 0.001m.
Catches any silent formula or vertex-layout drift between the
visual and physics paths. Skips gracefully if ACDREAM_DAT_DIR
isn't set (CI without dat data).
Local run with dat data: 10/10 landblocks loaded, 1000 samples,
max |delta| = 0.0305 mm (worst case: Direlands 0xC040).
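
The mesh-side oracle can be pictured as plain barycentric interpolation.
This is a generic Python sketch with a made-up triangle, not the test's
actual code; locating the matching triangle in the LandblockMesh.Build
output is left out.

```python
def barycentric_z(p, a, b, c):
    """Interpolate Z at 2D point p inside triangle (a, b, c).

    p is (x, y); a, b, c are (x, y, z) vertices.
    """
    px, py = p
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    wc = 1.0 - wa - wb
    return wa * az + wb * bz + wc * cz


# Hypothetical 24 m cell triangle; weights at (6, 6) are 0.5, 0.25, 0.25.
mesh_z = barycentric_z((6.0, 6.0), (0, 0, 0.0), (24, 0, 12.0), (0, 24, 6.0))
```

The sentinel then compares this value with the physics sample at the
same point and fails if the two differ by 0.001 m or more.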
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fragment shader for the modern terrain dispatcher. Bit-identical math
to today's terrain.frag (per-cell maskBlend3 + Phase G fog + lightning
flash). Same #version 460 + GL_ARB_bindless_texture preamble change
as terrain_modern.vert. Sampling syntax unchanged — the bindless-ness
is invisible at the GLSL level.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vertex shader for the modern terrain dispatcher. Bit-identical math
to today's terrain.vert (Phase 3c per-cell mesh + Phase G AdjustPlanes
lighting). The only structural change is the version + bindless
extension preamble — sampler access stays a regular sampler2DArray
uniform; bindless-ness is invisible at the GLSL level.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add optional BindlessSupport ctor parameter + GetBindlessHandles()
method that returns (terrainHandle, alphaHandle) ulongs with both
textures made resident. Two-phase Dispose mirroring TextureCache
(MakeNonResident before DeleteTexture per ARB_bindless_texture spec).
Existing callers pass `Build(gl, dats)` unchanged; bindless = null
default keeps them working until T6/T8 wires the renderer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sweeps all (lbX, lbY, cellX, cellY) tuples for the full 255x255
landblock map (~4.16M cells) and reports both the raw enum-output
disagreement (50.02%) and the diagonal-actually-painted disagreement
(49.98%) between WB's CalculateSplitDirection and acdream's
TerrainBlending.CalculateSplitDirection (which retail uses per
CLandBlockStruct::ConstructPolygons at retail addr 00531d10).
The two formulas behave like independent random hashes. Adopting
WB's pipeline wholesale would mis-render ~half the diagonals on
every landblock (Holtburg 0xA9B0: 29/64 cells = 45.3% wrong). This
data is the foundation for N.5b's Path A vs B vs C decision (kills
Path A).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures everything a fresh agent needs to pick up Phase N.5b (Terrain
on the Modern Rendering Path) without spelunking through the N.5
session history.
Front-loads the load-bearing constraint: issue #51 (WB's terrain split
formula diverges from retail's FSplitNESW). Lays out three viable
design paths (A: adopt WB's formula everywhere; B: keep retail's
formula and fork-patch WB; C: WB mesh layout but our formula). The
brainstorm needs to pick one, informed by quantified divergence rate
across representative landblocks.
Includes file-by-file inventory of acdream's terrain stack (1383 lines
across TerrainRenderer + TerrainChunkRenderer + TerrainAtlas + shaders)
vs WB's (1937 lines across TerrainRenderManager + TerrainGeometryGenerator
+ LandSurfaceManager). Eight brainstorm questions covering atlas model,
mesh ownership, index format, shader unification, streaming integration,
conformance test, and visual verification gate.
Mirrors the N.5 handoff structure that worked well last session:
TL;DR + where N.5 left things + what N.5b inherits + technical detail
+ files to read + brainstorm questions + acceptance criteria + first
30 minutes + things to NOT do.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records a new Phase A sub-piece: split the single ACDREAM_STREAM_RADIUS
into separate terrain + entity radii so terrain renders to a much
further horizon (WB-style) while entities/scenery stay at the current
closer radius.
Motivated by perf at ACDREAM_STREAM_RADIUS=5 dropping from ~810 fps
to ~200-300 fps because everything stays full-detail. Both retail and
WorldBuilder render terrain way out and strip entities at distance.
Estimate: 3-5 days for the radius split + fog tuning; +1 week if
terrain LOD via mesh decimation is included. Not yet brainstormed.
N.8 (sky + particles via WB's SkyboxRenderManager + ParticleEmitterRenderer)
was already on the roadmap; user confirmed they want it tracked there.
No edit needed for N.8 — already at the right level of detail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reframe the selection-blink follow-up so it doesn't suggest near-term
work. Was listed in N.5 ship record as "Phase B.4 follow-up adds the
field" — now phrased as open backlog with the hook reserved in
mesh_modern.vert's InstanceData comment for whoever eventually picks
it up.
The shader hook itself is unchanged — change is purely doc wording in
the plan SHIP record + CLAUDE.md WB integration cribs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
N.5: Modern Rendering Path. WbDrawDispatcher now uses bindless
textures + glMultiDrawElementsIndirect on top of N.4's grouped
pipeline. Three SSBO uploads + 2 indirect calls per frame, ~12-15
total GL calls for entity rendering regardless of scene complexity.
Measured 1.23 ms / frame median at Holtburg courtyard (1662 groups,
~810 fps). User-gated visual verification PASS at Holtburg.
Includes ship-amendment: legacy renderer path formally retired
(InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag
deleted). Bindless is now mandatory; missing extensions throw
NotSupportedException at startup with a clear error message.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Corrects the SHIP commit's acceptance gate verdict on the legacy
escape hatch. The original gate "[x] ACDREAM_USE_WB_FOUNDATION=0
still works" was inaccurate — Task 15's mesh_instanced deletion left
InstancedMeshRenderer orphaned + non-functional. Resolution: formal
retirement of the legacy path within N.5 (the prior commit).
Updated acceptance gate verdict:
- [N/A] ACDREAM_USE_WB_FOUNDATION=0 — escape hatch retired in N.5;
modern path is now mandatory, bindless required at startup. Missing
bindless throws NotSupportedException with a clear error message.
All other gates unchanged from the SHIP commit:
- [x] Visual identity to N.4 — Task 10 + Task 14 USER GATE PASS
- [x] CPU dispatcher time <= 70% of N.4 — measured 1.23 ms/frame at
Holtburg courtyard, comfortably under threshold
- [x] drawsIssued <= 5 per pass (CPU GL calls) — 2 indirect calls/frame
- [x] All tests green — 71/71 in the relevant filter
- [ ] GPU rendering time +-10% of N.4 — DEFERRED (timer query
double-buffering, N.6 follow-up)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final cross-cutting review of N.5 found that Task 15's deletion of
mesh_instanced.vert/.frag left InstancedMeshRenderer orphaned —
ACDREAM_USE_WB_FOUNDATION=0 silently rendered terrain+sky only with
no entities. The SHIP commit's "[x] ACDREAM_USE_WB_FOUNDATION=0 still
works" claim was inaccurate.
Resolution: formal retirement of the legacy renderer path within N.5
instead of deferring to N.6.
Deleted:
- src/AcDream.App/Rendering/InstancedMeshRenderer.cs
- src/AcDream.App/Rendering/StaticMeshRenderer.cs
- src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs
GameWindow simplified — capability detection is unconditional, missing
bindless throws NotSupportedException with a clear message at startup.
WbDrawDispatcher + mesh_modern shader load are mandatory after init.
No escape hatch.
GpuWorldState simplified — WbFoundationFlag.IsEnabled guards on
AddLandblock/RemoveLandblock removed; adapter calls are unconditional
when the adapter is non-null.
PendingSpawnIntegrationTests updated — WbFoundationFlag.ForTestsOnly_ForceEnable
static ctor removed (flag is gone; adapter calls are unconditional).
The ApplyLoadedTerrain physics-data loop was also simplified: the
EnsureUploaded sub-loop that fed InstancedMeshRenderer is gone;
_pendingCellMeshes is now explicitly cleared to prevent unbounded
accumulation (the worker thread still populates it, but WB handles
EnvCell geometry through its own pipeline).
Spec §2 Decision 5 + §10 Out-of-Scope updated. Plan ship-amendment
section added. Roadmap updated (N.5 ships with retirement; N.6 scope
narrowed to perf-only). CLAUDE.md "WB integration cribs" updated.
Perf baseline doc updated. WbDrawDispatcher class summary docstring
corrected to describe the as-shipped SSBO + multi-draw-indirect path.
ISSUES.md #51 updated (terrain not in N.5 scope; deferred to N.7).
Bindless support is now a hard requirement. Modern desktop GPUs
universally expose GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters;
if a user hits the NotSupportedException, that's a real bug report
worth investigating, not a silent fallback.
Build: 0 errors, 0 warnings. Tests: 71/71 (Wb+MatrixComposition+TextureCacheBindless filter).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bindless textures + glMultiDrawElementsIndirect on top of N.4's grouped
pipeline. Per-frame entity rendering: 3 SSBO uploads (instance matrices
@ binding=0, batch data @ binding=1, indirect commands) + 2 indirect
calls (opaque + transparent). Total ~12-15 GL calls per frame for entity
rendering, regardless of scene complexity.
Acceptance gates (spec §8.3):
- [x] Visual identity to N.4 — Task 10 USER GATE PASS (Holtburg courtyard)
+ Task 14 USER GATE PASS (general roaming, no regressions seen)
- [x] CPU dispatcher time ≤ 70% of N.4 — measured 1.23 ms/frame median
at Holtburg courtyard (1662 groups, ~810 fps); estimated N.4
hot path ≥2.5 ms/frame; comfortably under threshold
- [x] drawsIssued ≤ 5 per pass (CPU GL calls) — exactly 2 indirect calls
per frame regardless of scene size
- [x] All tests green — 71/71 in
FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless
- [x] ACDREAM_USE_WB_FOUNDATION=0 still works — InstancedMeshRenderer
escape hatch preserved (its own shader path, untouched)
- [ ] GPU rendering time within ±10% of N.4 — DEFERRED to N.6.
GL_TIME_ELAPSED query polling never reports avail!=1 within the
same frame; needs double-buffering. CPU is the load-bearing metric.
Plan amendments captured during execution:
- Task 2: parallel Texture2DArray upload path (replacing the original
"switch globally" framing that would've broken 4 legacy consumers)
- Task 3+4: parallel bindless cache dictionaries (avoiding the GLSL
type mismatch from sampling a Texture2D handle via sampler2DArray)
- Task 5: preserved mesh_instanced.frag's full SceneLighting UBO + 8
lights + fog + lightning flash + per-channel clamp
- Task 9: BatchDataPublic Pack=8 (required for safe MemoryMarshal.Cast)
Plan archived at:
docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Spec at:
docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md
Perf baseline at:
docs/plans/2026-05-08-phase-n5-perf-baseline.md
Memory at:
~/.claude/.../memory/project_phase_n5_state.md
Files changed: 6 added, 6 modified, 2 deleted. 19 tasks shipped across
~40 commits including amendments + fixups + reviews.
N.6 follow-ups: retire InstancedMeshRenderer entirely; GPU timer query
double-buffering; persistent-mapped buffers if profiling shows the
residual glBufferData hot spot; possible WB atlas adoption for memory
savings on shared content; possible GPU-side culling via compute pre-pass;
per-instance highlight (selection blink) for retail-faithful click feedback
(field reserved in mesh_modern.vert's InstanceData struct).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves N.5 from in-flight to Shipped (2026-05-08). N.6 (retire
InstancedMeshRenderer + perf polish) becomes the in-flight phase.
CLAUDE.md in-flight pointer updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records the as-shipped state: acceptance gate verdicts, plan amendments
captured during execution, code-review adjustments per task, out-of-scope
N.6 follow-ups, and a complete files-changed summary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses
mesh_modern when WB foundation is on. Legacy escape hatch
(ACDREAM_USE_WB_FOUNDATION=0 or bindless missing) runs through
InstancedMeshRenderer which has its own shader path — untouched.
GameWindow's else-branch removed; if bindless is missing, _meshShader
stays unloaded, _wbDrawDispatcher stays null, and _staticMesh is not
constructed (its guard requires _meshShader non-null). All downstream
_staticMesh usages were already null-safe (null-conditional operators
or explicit null guards). Two null-forgiving operators (!) added at the
WbDrawDispatcher + SkyRenderer construction sites where the compiler
couldn't prove non-null but the logic guarantees it (both require
_bindlessSupport non-null, which implies _meshShader was assigned;
_textureCache is assigned unconditionally).
InstancedMeshRenderer.cs: the one reference to mesh_instanced was
a code comment (location 3 NOT used by mesh_instanced.vert), not
a file load. The escape-hatch code path is preserved; the comment
is now stale, but fixing it is low priority.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CPU dispatcher: 1227 µs / frame median (1303 µs p95) at Holtburg
courtyard, 1662 groups in working set. Inferred ceiling of ~810 fps if
the dispatcher were the only per-frame cost (1 s / 1227 µs ≈ 815).
CPU dispatcher acceptance gate (≤70% of N.4): PASS. N.4's per-group
hot path is estimated at ≥2500 µs / frame at this scene complexity,
so N.5's 1227 µs median is at most ~49% of it.
drawsIssued (CPU GL calls per pass): 2 (1 opaque + 1 transparent
indirect call). Down from N.4's ~hundreds per pass. PASS.
GPU timing: unmeasured. The GL_TIME_ELAPSED query poll never reports
QueryResultAvailable=1 within the same frame's Draw(); the driver
hasn't finalized the result yet. Fix is double-buffering (queryA
on frame N, read on N+2). Deferred to N.6 perf polish — doesn't block
N.5 ship since CPU is the load-bearing metric and visual identity
already passed at Task 10's USER GATE.
Direct N.4 baseline NOT measured. Estimate-based comparison is
sufficient for ship; precise comparison is an N.6 follow-up.
Baseline doc at docs/plans/2026-05-08-phase-n5-perf-baseline.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent) wrapping each glMultiDrawElementsIndirect,
polled non-blocking via QueryResultAvailable on the next frame.
Sample window is 256 frames per signal; median + p95 reported.
Numbers populate the SHIP commit's perf table at Task 19.
Silk.NET naming note: GL_TIME_ELAPSED queries use QueryTarget.TimeElapsed
(confirmed present in Silk.NET.OpenGL 2.23.0 DLL). The 64-bit result is
read via GetQueryObject(..., out ulong) which dispatches to
glGetQueryObjectui64v; the int overload (glGetQueryObjectiv) is used for
the ResultAvailable poll, matching WorldBuilder's VisibilityManager pattern.
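The rolling median + p95 over a fixed sample window can be sketched as
below. This is a minimal illustration, not the shipped code: the
SampleWindow class, its method names, and the nearest-rank percentile
rule are all assumptions; only the 256-sample window size comes from the
text above.

```csharp
using System;

// Hypothetical helper: fixed-size ring of the most recent timing samples.
public sealed class SampleWindow
{
    private readonly double[] _samples;
    private int _next;   // slot the next sample overwrites
    private int _count;  // number of valid samples, capped at capacity

    public SampleWindow(int capacity = 256) => _samples = new double[capacity];

    public void Add(double microseconds)
    {
        _samples[_next] = microseconds;
        _next = (_next + 1) % _samples.Length;
        if (_count < _samples.Length) _count++;
    }

    // Nearest-rank percentile over the valid samples; p in (0, 100].
    // Returns 0 when no samples have been recorded yet.
    public double Percentile(double p)
    {
        if (_count == 0) return 0;
        var sorted = new double[_count];
        Array.Copy(_samples, sorted, _count); // valid samples occupy [0, _count)
        Array.Sort(sorted);
        int rank = (int)Math.Ceiling(p / 100.0 * _count);
        return sorted[Math.Clamp(rank, 1, _count) - 1];
    }
}
```

Under these assumptions a caller would Add() one value per frame and read
Percentile(50) and Percentile(95) at each rollup. Sorting a copy of 256
values every 5 seconds is cheap enough that no incremental structure is
needed.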
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Locks in Decision 2 (Opaque + ClipMap → opaque indirect; AlphaBlend +
Additive + InvAlpha → transparent indirect). Catches future refactors
that drift the partition — silent visual regression otherwise (groups
rendered in the wrong pass with the wrong blend state).
Adds public static IsOpaquePublic shim on WbDrawDispatcher; the
underlying IsOpaque stays private.
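A sketch of the partition rule the test pins down. The enum name
TranslucencyKind and the class name PassPartition are hypothetical; only
the five kind names and their opaque/transparent assignment come from
Decision 2 as described above.

```csharp
using System;

// Hypothetical stand-in for the project's translucency enum.
public enum TranslucencyKind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

public static class PassPartition
{
    // Opaque + ClipMap go to the opaque indirect call; the three blended
    // kinds go to the transparent indirect call.
    public static bool IsOpaque(TranslucencyKind kind) => kind switch
    {
        TranslucencyKind.Opaque or TranslucencyKind.ClipMap => true,
        TranslucencyKind.AlphaBlend
            or TranslucencyKind.Additive
            or TranslucencyKind.InvAlpha => false,
        // An unclassified kind fails loudly instead of silently picking a pass.
        _ => throw new ArgumentOutOfRangeException(nameof(kind), kind, "unclassified kind"),
    };
}
```

Throwing in the default arm is a design choice of this sketch: a newly
added enum member then surfaces as an exception rather than as a group
drawn with the wrong blend state.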
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Uploads three buffers per frame (two SSBOs plus the indirect command buffer):
- _instanceSsbo @ binding=0 (mat4 per instance, indexed by gl_BaseInstanceARB + gl_InstanceID)
- _batchSsbo @ binding=1 (BatchData per group, indexed by gl_DrawIDARB)
- _indirectBuffer (DrawElementsIndirectCommand[] — opaque first, transparent second)
GameWindow swaps the shader load to mesh_modern when _bindlessSupport
is non-null. Capability detection + shader load now run in the right
order (capability before TextureCache + before Shader).
Deletes the obsolete DrawGroup stub, EnsureInstanceAttribs, _instanceBuffer,
_patchedVaos. ClassifyBatches + ResolveTexture already migrated in
Task 8 to use ulong bindless handles.
BuildIndirectArrays (Task 9) wired in: _opaqueDraws + _translucentDraws
are flattened into IndirectGroupInput[], laid out via the helper into
contiguous indirect commands + parallel BatchData[]. opaqueByteOffset=0,
transparentByteOffset = opaqueCount × DrawCommandStride.
Visual verification (USER GATE) PASS: Holtburg courtyard renders
identical to N.4 — terrain, scenery, characters, NPCs all visible
without artifacts. [N.5] modern path capabilities present + mesh_modern
shader loaded log lines confirm the boot path. [WB-DIAG] hot-path
counters show healthy entity/draw activity.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code quality review caught:
- sizeofDEIC was a local; promoted to public const DrawCommandStride
so tests can reference it symbolically.
- BatchDataPublic layout invariant (size + field offsets) wasn't
asserted in tests. Added BatchDataPublic_LayoutMatchesPrivateBatchData
+ DrawCommandStride_MatchesStructSize tests to gate Task 10's
MemoryMarshal.Cast<BatchData, BatchDataPublic> safety.
- Plan doc updated: BatchDataPublic spec was Pack=4 (wrong — must
match private BatchData's Pack=8 for the cast to work). Implementation
was already correct; plan now matches.
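Why the Pack value matters for the cast can be illustrated with a small
layout check. Everything here is a sketch: the struct names and the
Flags/TextureHandle fields are hypothetical stand-ins (the real BatchData
fields are not listed in this message); only the Pack=8 vs Pack=4
distinction is taken from the text above.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical private-side struct: a 32-bit field followed by a 64-bit one.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct BatchSketchPrivate { public uint Flags; public ulong TextureHandle; }

// Public mirror with the same Pack, so sizes and offsets agree.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct BatchSketchPublic { public uint Flags; public ulong TextureHandle; }

// Same fields with Pack = 4: the ulong is no longer padded out to offset 8.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct BatchSketchPack4 { public uint Flags; public ulong TextureHandle; }

public static class LayoutCheck
{
    // True when both structs have the same size and the named fields sit
    // at the same byte offsets.
    public static bool SameLayout<TA, TB>(params string[] fields)
        where TA : struct where TB : struct
    {
        if (Unsafe.SizeOf<TA>() != Unsafe.SizeOf<TB>()) return false;
        foreach (var f in fields)
            if (Marshal.OffsetOf<TA>(f) != Marshal.OffsetOf<TB>(f)) return false;
        return true;
    }

    // Reinterprets the private struct as the public one without copying;
    // only meaningful when SameLayout holds for the two types.
    public static ulong RoundTrip(ulong handle)
    {
        var src = new[] { new BatchSketchPrivate { Flags = 1, TextureHandle = handle } };
        Span<BatchSketchPublic> view =
            MemoryMarshal.Cast<BatchSketchPrivate, BatchSketchPublic>(src.AsSpan());
        return view[0].TextureHandle;
    }
}
```

With these stand-in types, SameLayout holds for the two Pack=8 structs
and fails against the Pack=4 variant, which is the kind of mismatch the
new layout tests are meant to catch before the cast reads fields at the
wrong offsets.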
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.
Tests cover: spec §5 walk-through layout; empty group list edge case;
ClipMap classification (treated as opaque, not transparent).
Static + public so tests can exercise without a GL context. Task 10
wires it into the rewritten Draw() method.
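The layout rule (opaque section first, transparent second, transparent
byte offset = opaqueCount × DrawCommandStride) can be sketched as below.
GroupInput and IndirectLayout.Build are hypothetical stand-ins for
IndirectGroupInput and BuildIndirectArrays, whose real signatures are not
shown here, and the parallel BatchData[] is omitted for brevity. The
five-field command struct follows the OpenGL indirect-draw command layout
(5 × 4 bytes = 20). Requires C# 10 for the record struct.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// The five 32-bit fields OpenGL reads per glMultiDrawElementsIndirect command.
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical input record; field set assumed for illustration.
public readonly record struct GroupInput(
    uint IndexCount, uint InstanceCount, uint FirstIndex,
    int BaseVertex, uint FirstInstance, bool IsOpaque);

public static class IndirectLayout
{
    public const int DrawCommandStride = 20; // 5 fields × 4 bytes

    public static (DrawElementsIndirectCommand[] Commands, int OpaqueCount,
                   int TransparentCount, int TransparentByteOffset)
        Build(IReadOnlyList<GroupInput> groups)
    {
        var commands = new DrawElementsIndirectCommand[groups.Count];

        // First pass: count opaque groups so the transparent section's
        // starting slot is known before any command is written.
        int opaqueCount = 0;
        foreach (var g in groups) if (g.IsOpaque) opaqueCount++;

        // Second pass: write each group into its section, keeping input order.
        int nextOpaque = 0, nextTransparent = opaqueCount;
        foreach (var g in groups)
        {
            int slot = g.IsOpaque ? nextOpaque++ : nextTransparent++;
            commands[slot] = new DrawElementsIndirectCommand
            {
                Count = g.IndexCount,
                InstanceCount = g.InstanceCount,
                FirstIndex = g.FirstIndex,
                BaseVertex = g.BaseVertex,
                BaseInstance = g.FirstInstance,
            };
        }

        return (commands, opaqueCount, groups.Count - opaqueCount,
                opaqueCount * DrawCommandStride);
    }
}
```

Because the helper touches no GL state, a test can feed it a mixed list
and check the section boundaries directly; an empty list yields zero
counts and a zero offset.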
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>