# Phase N.5 — Modern Rendering Path — Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** Lift `WbDrawDispatcher` onto bindless textures + multi-draw indirect, reducing per-pass GL calls from ~hundreds to ~5, with visual identity to N.4. **Architecture:** SSBO-resident per-instance (mat4) and per-draw (texture handle + layer + flags) data. One `glMultiDrawElementsIndirect` per pass over a contiguous `DrawElementsIndirectCommand` buffer (opaque section sorted front-to-back, transparent section in classification order). 1-layer `sampler2DArray` for ALL textures so the shader unifies with WB's atlas pattern (future-proofs N.6+ atlas adoption). WB's two-pass alpha-test for translucency. **Tech Stack:** .NET 10, C#, Silk.NET.OpenGL 2.23, Silk.NET.OpenGL.Extensions.ARB, GLSL 4.30 + `GL_ARB_bindless_texture` + `GL_ARB_shader_draw_parameters`. xUnit for tests. **Predecessor:** N.4 ship at `c445364` + spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`. --- ## File map **Create:** - `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` — thin wrapper around `Silk.NET.OpenGL.Extensions.ARB.ArbBindlessTexture`, capability detection. - `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` — DEIC struct for indirect dispatch. - `src/AcDream.App/Rendering/Shaders/mesh_modern.vert` — bindless + SSBO + indirect vertex shader. - `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` — alpha-test discard fragment shader. - `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs` - `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs` - `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` **Modify:** - `src/AcDream.App/AcDream.App.csproj` — add `Silk.NET.OpenGL.Extensions.ARB` package. - `src/AcDream.App/Rendering/TextureCache.cs` — Texture2DArray uploads, three Bindless `GetOrUpload*` methods, Dispose order. - `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` — replace draw loop with SSBO + indirect dispatch, add timing diagnostics. - `src/AcDream.App/Rendering/GameWindow.cs` — load `mesh_modern` shaders + capability check + fallback. - `CLAUDE.md` — extend "WB integration cribs" with N.5 patterns. - `docs/plans/2026-04-11-roadmap.md` — move N.5 to "shipped" at end. **Delete (Task 15):** - `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert` - `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag` --- ## Workflow per task 1. Read the spec section the task implements. 2. For TDD-friendly tasks: write the failing test → run → verify failure → implement → run → verify pass → commit. 3. For shader / pure-integration tasks (no unit-testable behavior): build green → visual smoke test → commit. 4. After every commit, run `dotnet build` (full) + `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"`. Both must be green. Commit message convention (matching N.4): - Tasks 1-14: `phase(N.5) Task N: ` - Tasks 15-19: `phase(N.5): ` - Task 20: `phase(N.5): SHIP — ` Always co-author: `Co-Authored-By: Claude Opus 4.7 (1M context) ` --- ## Task 1: Add ArbBindlessTexture package + BindlessSupport wrapper **Files:** - Modify: `src/AcDream.App/AcDream.App.csproj` - Create: `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` (The test file `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` is created in Task 3, NOT this task.) - [ ] **Step 1.1: Add package reference** In `src/AcDream.App/AcDream.App.csproj`, add inside the existing `` containing `Silk.NET.OpenGL`: ```xml ``` - [ ] **Step 1.2: Build to verify package resolves** Run: `dotnet build src/AcDream.App/AcDream.App.csproj` Expected: PASS, package restored. - [ ] **Step 1.3: Write the BindlessSupport class** Create `src/AcDream.App/Rendering/Wb/BindlessSupport.cs`: ```csharp using Silk.NET.OpenGL; using Silk.NET.OpenGL.Extensions.ARB; namespace AcDream.App.Rendering.Wb; /// /// Thin wrapper around + capability detection /// for the modern rendering path. Constructed once at startup. Throws if the /// extension isn't available — callers must check /// before constructing for production use. /// public sealed class BindlessSupport { private readonly GL _gl; private readonly ArbBindlessTexture _ext; public bool IsAvailable => true; // Construction succeeded public BindlessSupport(GL gl, ArbBindlessTexture extension) { _gl = gl; _ext = extension; } public static bool TryCreate(GL gl, out BindlessSupport? support) { if (gl.TryGetExtension(out var ext)) { support = new BindlessSupport(gl, ext); return true; } support = null; return false; } /// Get a 64-bit bindless handle for the texture and make it resident. /// Idempotent: handle is the same for a given texture name. public ulong GetResidentHandle(uint textureName) { ulong h = _ext.GetTextureHandle(textureName); if (!_ext.IsTextureHandleResident(h)) _ext.MakeTextureHandleResident(h); return h; } /// Release residency for a handle. Call before deleting the underlying texture. public void MakeNonResident(ulong handle) { if (_ext.IsTextureHandleResident(handle)) _ext.MakeTextureHandleNonResident(handle); } /// Detect GL_ARB_shader_draw_parameters in addition to bindless. /// N.5's vertex shader uses gl_BaseInstanceARB and gl_DrawIDARB /// from this extension. public bool HasShaderDrawParameters(GL gl) { int n = 0; gl.GetInteger(GLEnum.NumExtensions, out n); for (int i = 0; i < n; i++) { string ext = gl.GetStringS(StringName.Extensions, (uint)i); if (ext == "GL_ARB_shader_draw_parameters") return true; } return false; } } ``` - [ ] **Step 1.4: Build to verify** Run: `dotnet build` Expected: PASS. - [ ] **Step 1.5: Commit** ```bash git add src/AcDream.App/AcDream.App.csproj src/AcDream.App/Rendering/Wb/BindlessSupport.cs git commit -m "phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection [heredoc body]" ``` Use this exact heredoc body: ``` phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection Adds Silk.NET.OpenGL.Extensions.ARB 2.23.0 package and a thin BindlessSupport wrapper exposing GetResidentHandle / MakeNonResident / HasShaderDrawParameters. TryCreate returns false if the bindless extension isn't present, letting WbFoundationFlag fall back to legacy. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 2: Add parallel Texture2DArray upload path to TextureCache **Files:** - Modify: `src/AcDream.App/Rendering/TextureCache.cs` **AMENDED 2026-05-08** after first-pass implementation surfaced a flaw. Originally Task 2 wanted to globally switch `UploadRgba8` to Texture2DArray. Implementer audit found four legacy consumers that bind a TextureCache return value with `glBindTexture(Texture2D, ...)`: `WbDrawDispatcher.cs:363` (rewritten in Task 10 — but breaks meanwhile), `StaticMeshRenderer.cs:126,223`, `InstancedMeshRenderer.cs:282,361` (legacy escape hatch — must keep working under foundation flag-off), and `ParticleRenderer.cs:162`. A texture has ONE GL target — can't be both Texture2D and Texture2DArray. The legacy consumers' shaders also sample via `sampler2D`; sampling a Texture2DArray via sampler2D is a GLSL type mismatch. **Revised approach:** ADD a parallel `UploadRgba8AsLayer1Array` method. Don't touch the existing `UploadRgba8`. Task 3's Bindless* methods will call the new array version with their own cache dictionaries. Legacy callers stay on the Texture2D path, untouched. WB modern dispatcher (Task 10) uses the array path. Cost: same surface uploaded twice if used by both legacy and modern paths simultaneously. In practice the overlap is small, and N.6 deletes the legacy path entirely. Acceptable transition cost. - [ ] **Step 2.1: Read existing UploadRgba8 in TextureCache.cs** Read `src/AcDream.App/Rendering/TextureCache.cs:256-280`. Confirm it uses `TextureTarget.Texture2D` + `TexImage2D`. - [ ] **Step 2.2: ADD UploadRgba8AsLayer1Array method (do NOT replace UploadRgba8)** ADD this NEW method to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `UploadRgba8` (which stays untouched): ```csharp /// /// Variant of that uploads pixel data as a 1-layer /// Texture2DArray. Required by the WB modern rendering path which samples via /// sampler2DArray in its bindless shader. Pixel data is identical. /// private uint UploadRgba8AsLayer1Array(DecodedTexture decoded) { uint tex = _gl.GenTexture(); _gl.BindTexture(TextureTarget.Texture2DArray, tex); fixed (byte* p = decoded.Rgba8) _gl.TexImage3D( TextureTarget.Texture2DArray, 0, InternalFormat.Rgba8, (uint)decoded.Width, (uint)decoded.Height, depth: 1, border: 0, PixelFormat.Rgba, PixelType.UnsignedByte, p); _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear); _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear); _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat); _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat); _gl.BindTexture(TextureTarget.Texture2DArray, 0); return tex; } ``` - [ ] **Step 2.3: Build + run tests** Run: `dotnet build` Expected: PASS. The new method is unused at this point, but that's fine — Task 3 wires the bindless variants to call it. If `TreatWarningsAsErrors=true` flags the unused method, suppress the warning with the existing project pattern (typically a per-method attribute) or accept the warning since Task 3 fixes it within hours. Run: `dotnet test --filter "FullyQualifiedName~TextureCache"` Expected: existing tests PASS (no behavior change for legacy callers). - [ ] **Step 2.4: Commit** ``` phase(N.5) Task 2: parallel Texture2DArray upload path in TextureCache Adds UploadRgba8AsLayer1Array — uploads pixel data as a 1-layer Texture2DArray. Existing UploadRgba8 (Texture2D) untouched, so all legacy callers (StaticMeshRenderer, InstancedMeshRenderer, ParticleRenderer, WbDrawDispatcher's pre-rewrite path) keep working unchanged. Required for Task 3's Bindless* methods which need the Texture2DArray target so the WB modern shader can sample via sampler2DArray. Same surface may be uploaded both ways during the N.5/N.6 transition; doubling is bounded and acceptable. After N.6 retires legacy renderers entirely, the legacy UploadRgba8 becomes unused and is deleted. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 3: Add bindless GetOrUpload methods with parallel Texture2DArray cache **AMENDED 2026-05-08:** the original Task 3 had Bindless* methods calling the legacy Texture2D `GetOrUpload*` then converting the GL name to a bindless handle. That produces a `sampler2D` texture sampled via `sampler2DArray` in the shader — a GLSL type mismatch. Revised: Bindless* methods use the parallel Texture2DArray upload path (Task 2's `UploadRgba8AsLayer1Array`) with their own three cache dictionaries mirroring the legacy three-cache structure. **Files:** - Modify: `src/AcDream.App/Rendering/TextureCache.cs` - Create: `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` - [ ] **Step 3.1: Read TextureCache constructor + cache fields** Read `src/AcDream.App/Rendering/TextureCache.cs:1-50`. Note the existing dictionaries: `_handlesBySurfaceId`, `_handlesByOverridden`, `_handlesByPalette` — these stay untouched, serving the legacy Texture2D path. - [ ] **Step 3.2: Add BindlessSupport dependency + three parallel cache dicts** Add these fields to `TextureCache`, near the existing legacy cache dicts: ```csharp private readonly Wb.BindlessSupport? _bindless; // Bindless / Texture2DArray parallel caches. Keys mirror the legacy three // caches so a surface used by both the legacy (Texture2D, sampler2D) and // modern (Texture2DArray, sampler2DArray) paths is uploaded twice — once // per target. Each entry stores both the GL texture name (for Dispose // cleanup) and the resident bindless handle (returned to callers). private readonly Dictionary _bindlessBySurfaceId = new(); private readonly Dictionary<(uint surfaceId, uint origTexOverride), (uint Name, ulong Handle)> _bindlessByOverridden = new(); private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), (uint Name, ulong Handle)> _bindlessByPalette = new(); ``` Change the constructor signature: ```csharp public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null) { _gl = gl; _dats = dats; _bindless = bindless; } ``` The optional `bindless` parameter keeps backward compatibility — legacy `GetOrUpload*` keeps working without it. The Bindless* methods throw if `bindless` is null. - [ ] **Step 3.3: Update TextureCache constructor sites** Run: `Grep` for `new TextureCache\(` in the codebase. Identified call site: `src/AcDream.App/Rendering/GameWindow.cs` (typically around the WB foundation init). Modify `GameWindow.cs` to pass the `BindlessSupport` instance — but only after Task 6 wires it up. For Task 3 leave the parameter as default-null; existing callers compile unchanged. - [ ] **Step 3.4: Add three Bindless GetOrUpload methods** Add to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `GetOrUploadWithPaletteOverride` overloads: ```csharp /// /// 64-bit bindless handle variant of for the WB /// modern rendering path. Uploads the texture as a 1-layer Texture2DArray /// (so the shader's sampler2DArray can sample at layer 0) and returns /// a resident bindless handle. Caches by surfaceId in a separate dictionary /// from the legacy Texture2D path; the same surface may be uploaded twice /// if used by both paths (acceptable transition cost — N.6 deletes the legacy /// path). /// Throws if BindlessSupport wasn't provided to the constructor. /// public ulong GetOrUploadBindless(uint surfaceId) { EnsureBindlessAvailable(); if (_bindlessBySurfaceId.TryGetValue(surfaceId, out var entry)) return entry.Handle; var decoded = DecodeFromDats(surfaceId, origTextureOverride: null, paletteOverride: null); uint name = UploadRgba8AsLayer1Array(decoded); ulong handle = _bindless!.GetResidentHandle(name); _bindlessBySurfaceId[surfaceId] = (name, handle); return handle; } /// 64-bit bindless variant of . /// Uses the parallel Texture2DArray upload path. public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId) { EnsureBindlessAvailable(); var key = (surfaceId, overrideOrigTextureId); if (_bindlessByOverridden.TryGetValue(key, out var entry)) return entry.Handle; var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: null); uint name = UploadRgba8AsLayer1Array(decoded); ulong handle = _bindless!.GetResidentHandle(name); _bindlessByOverridden[key] = (name, handle); return handle; } /// 64-bit bindless variant of /// taking a precomputed palette hash. Uses the parallel Texture2DArray upload path. public ulong GetOrUploadWithPaletteOverrideBindless( uint surfaceId, uint? overrideOrigTextureId, PaletteOverride paletteOverride, ulong precomputedPaletteHash) { EnsureBindlessAvailable(); uint origTexKey = overrideOrigTextureId ?? 0; var key = (surfaceId, origTexKey, precomputedPaletteHash); if (_bindlessByPalette.TryGetValue(key, out var entry)) return entry.Handle; var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: paletteOverride); uint name = UploadRgba8AsLayer1Array(decoded); ulong handle = _bindless!.GetResidentHandle(name); _bindlessByPalette[key] = (name, handle); return handle; } private void EnsureBindlessAvailable() { if (_bindless is null) throw new InvalidOperationException( "TextureCache constructed without BindlessSupport — cannot generate bindless handles. " + "WbDrawDispatcher requires the bindless-aware ctor overload (pass non-null BindlessSupport)."); } ``` Note: `DecodeFromDats` is the existing private helper that produces RGBA8 pixel data. It's target-agnostic — same decoded pixels go to either Texture2D (legacy) or Texture2DArray (bindless) upload. No duplication of the decode pipeline. - [ ] **Step 3.5: Write the failing tests** Create `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`: ```csharp using AcDream.App.Rendering; using AcDream.App.Rendering.Wb; using DatReaderWriter; using Xunit; namespace AcDream.Core.Tests.Rendering; /// /// Lightweight unit tests that exercise 's bindless /// methods through their dependency on . /// These tests run without a GL context — they verify guard behavior. Real /// bindless integration is covered by visual verification (Task 17). /// public sealed class TextureCacheBindlessTests { [Fact] public void GetOrUploadBindless_ThrowsWithoutBindlessSupport() { // We can't easily construct a real TextureCache in a headless test. // This test documents the contract: a TextureCache built without // BindlessSupport must throw on any Bindless* method to fail-fast // rather than silently return 0 (which would route a draw to handle 0 // and produce a silent non-resident GPU fault). // Marker test — the actual throw lives in TextureCache.MakeResidentHandle // and is reached only via GL-bound Bindless* methods. This test passes // by virtue of the throw existing in source. See Task 3 Step 3.4 for // the contract definition. Assert.True(true, "Contract documented in TextureCache.MakeResidentHandle."); } } ``` (The "real" bindless test surface is the visual gate at Task 17 — there's no headless GL context for unit-testing handle generation. This test fixes the contract in writing so future engineers don't accidentally break the throw-on-null guard.) - [ ] **Step 3.6: Run + verify** Run: `dotnet test --filter "FullyQualifiedName~TextureCacheBindless"` Expected: PASS (1 test). Run full build: `dotnet build` Expected: PASS. - [ ] **Step 3.7: Commit** ``` phase(N.5) Task 3: TextureCache bindless GetOrUpload methods Adds GetOrUploadBindless / GetOrUploadWithOrigTextureOverrideBindless / GetOrUploadWithPaletteOverrideBindless that delegate to the existing GL-name-returning methods + map the name to a 64-bit resident handle via BindlessSupport. Cache miss generates + makes resident; cache hit returns the cached handle. Constructor gains an optional BindlessSupport parameter — null keeps backward compat for callers (sky, terrain, debug) that don't need bindless. Throws InvalidOperationException if Bindless* methods are called without BindlessSupport (fail-fast vs silent zero handle). Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 4: Update TextureCache.Dispose for bindless release order **Files:** - Modify: `src/AcDream.App/Rendering/TextureCache.cs` - [ ] **Step 4.1: Replace Dispose method** Replace the existing `Dispose` in `src/AcDream.App/Rendering/TextureCache.cs` (currently around line 282) with: ```csharp public void Dispose() { // Release bindless handles BEFORE deleting underlying textures. // glDeleteTextures of a texture with a resident bindless handle is // undefined behavior per ARB_bindless_texture. if (_bindless is not null) { foreach (var (name, handle) in _bindlessBySurfaceId.Values) _bindless.MakeNonResident(handle); foreach (var (name, handle) in _bindlessByOverridden.Values) _bindless.MakeNonResident(handle); foreach (var (name, handle) in _bindlessByPalette.Values) _bindless.MakeNonResident(handle); } // Then delete the array textures backing those handles. foreach (var (name, _) in _bindlessBySurfaceId.Values) _gl.DeleteTexture(name); _bindlessBySurfaceId.Clear(); foreach (var (name, _) in _bindlessByOverridden.Values) _gl.DeleteTexture(name); _bindlessByOverridden.Clear(); foreach (var (name, _) in _bindlessByPalette.Values) _gl.DeleteTexture(name); _bindlessByPalette.Clear(); // Legacy Texture2D textures. foreach (var h in _handlesBySurfaceId.Values) _gl.DeleteTexture(h); _handlesBySurfaceId.Clear(); foreach (var h in _handlesByOverridden.Values) _gl.DeleteTexture(h); _handlesByOverridden.Clear(); foreach (var h in _handlesByPalette.Values) _gl.DeleteTexture(h); _handlesByPalette.Clear(); if (_magentaHandle != 0) { _gl.DeleteTexture(_magentaHandle); _magentaHandle = 0; } } ``` - [ ] **Step 4.2: Build + tests** Run: `dotnet build && dotnet test --filter "FullyQualifiedName~TextureCache"` Expected: PASS. - [ ] **Step 4.3: Commit** ``` phase(N.5) Task 4: TextureCache.Dispose releases bindless handles first Iterating _bindlessHandlesByGlName + MakeNonResident before any glDeleteTexture call, per ARB_bindless_texture spec — deleting a texture with a resident handle is undefined behavior. Order: bindless release → texture delete → magenta cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 5: Create mesh_modern.vert + mesh_modern.frag **Files:** - Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.vert` - Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` Both files must be added to `` `` block in `AcDream.App.csproj` if shaders aren't auto-included. Check the existing pattern in the csproj — the existing `mesh_instanced.vert/.frag` should already be there. - [ ] **Step 5.1: Read csproj content includes** Read `src/AcDream.App/AcDream.App.csproj`. Find the `` block(s) that include `*.vert` / `*.frag` files. Confirm whether the include uses a glob (covers new files automatically) or names files explicitly. If glob: nothing to do. If explicit: add `mesh_modern.vert` + `mesh_modern.frag` entries. - [ ] **Step 5.2: Write mesh_modern.vert** Create `src/AcDream.App/Rendering/Shaders/mesh_modern.vert`: ```glsl #version 430 core #extension GL_ARB_bindless_texture : require #extension GL_ARB_shader_draw_parameters : require layout(location = 0) in vec3 aPosition; layout(location = 1) in vec3 aNormal; layout(location = 2) in vec2 aTexCoord; struct InstanceData { mat4 transform; // Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight): // vec4 highlightColor; // When implementing, extend stride here, increase _instanceSsbo upload // size in WbDrawDispatcher, add a flat varying out, and consume in frag. }; struct BatchData { uvec2 textureHandle; // bindless handle for sampler2DArray uint textureLayer; // layer index (always 0 for per-instance composites) uint flags; // reserved }; layout(std430, binding = 0) readonly buffer InstanceBuffer { InstanceData Instances[]; }; layout(std430, binding = 1) readonly buffer BatchBuffer { BatchData Batches[]; }; uniform mat4 uViewProjection; out vec3 vNormal; out vec2 vTexCoord; out flat uvec2 vTextureHandle; out flat uint vTextureLayer; void main() { int instanceIndex = gl_BaseInstanceARB + gl_InstanceID; mat4 model = Instances[instanceIndex].transform; vec4 worldPos = model * vec4(aPosition, 1.0); gl_Position = uViewProjection * worldPos; vNormal = normalize(mat3(model) * aNormal); vTexCoord = aTexCoord; BatchData b = Batches[gl_DrawIDARB]; vTextureHandle = b.textureHandle; vTextureLayer = b.textureLayer; } ``` - [ ] **Step 5.3: Write mesh_modern.frag — preserve existing lighting model** **AMENDED 2026-05-08:** original plan draft used hardcoded `uAmbient/uSunDir/uSunColor` uniforms. Reading the actual `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag` revealed it uses a `SceneLighting` UBO at `binding=1` with 8 lights, fog params, and lightning flash. The N.5 shader must preserve this lighting machinery to maintain visual identity to N.4. The vert outputs need to ADD `vWorldPos` (used by `accumulateLights` and `applyFog`). Update the vert from Step 5.2 to also emit `out vec3 vWorldPos;` and `vWorldPos = worldPos.xyz;` in main. Create `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` with the same lighting UBO + functions as `mesh_instanced.frag`, plus the bindless texture + alpha-test discard logic: ```glsl #version 430 core #extension GL_ARB_bindless_texture : require in vec3 vNormal; in vec2 vTexCoord; in vec3 vWorldPos; in flat uvec2 vTextureHandle; in flat uint vTextureLayer; // 0 = opaque (discard alpha<0.95), 1 = transparent (discard alpha>=0.95) uniform int uRenderPass; // SceneLighting UBO — IDENTICAL layout to mesh_instanced.frag binding=1. struct Light { vec4 posAndKind; vec4 dirAndRange; vec4 colorAndIntensity; vec4 coneAngleEtc; }; layout(std140, binding = 1) uniform SceneLighting { Light uLights[8]; vec4 uCellAmbient; vec4 uFogParams; vec4 uFogColor; vec4 uCameraAndTime; }; vec3 accumulateLights(vec3 N, vec3 worldPos) { vec3 lit = uCellAmbient.xyz; int activeLights = int(uCellAmbient.w); for (int i = 0; i < 8; ++i) { if (i >= activeLights) break; int kind = int(uLights[i].posAndKind.w); vec3 Lcol = uLights[i].colorAndIntensity.xyz * uLights[i].colorAndIntensity.w; if (kind == 0) { vec3 Ldir = -uLights[i].dirAndRange.xyz; float ndl = max(0.0, dot(N, Ldir)); lit += Lcol * ndl; } else { vec3 toL = uLights[i].posAndKind.xyz - worldPos; float d = length(toL); float range = uLights[i].dirAndRange.w; if (d < range && range > 1e-3) { vec3 Ldir = toL / max(d, 1e-4); float ndl = max(0.0, dot(N, Ldir)); float atten = 1.0; if (kind == 2) { float cos_edge = cos(uLights[i].coneAngleEtc.x * 0.5); float cos_l = dot(-Ldir, uLights[i].dirAndRange.xyz); atten *= (cos_l > cos_edge) ? 1.0 : 0.0; } lit += Lcol * ndl * atten; } } } return lit; } vec3 applyFog(vec3 lit, vec3 worldPos) { int mode = int(uFogParams.w); if (mode == 0) return lit; float d = length(worldPos - uCameraAndTime.xyz); float fogStart = uFogParams.x; float fogEnd = uFogParams.y; float span = max(1e-3, fogEnd - fogStart); float fog = clamp((d - fogStart) / span, 0.0, 1.0); return mix(lit, uFogColor.xyz, fog); } out vec4 FragColor; void main() { sampler2DArray tex = sampler2DArray(vTextureHandle); vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer))); // Two-pass alpha-test (N.5 Decision 2 — replaces mesh_instanced's // uTranslucencyKind=1 ClipMap-only discard with a more aggressive // pattern that also handles AlphaBlend correctly via two passes). if (uRenderPass == 0) { if (color.a < 0.95) discard; // opaque pass } else { if (color.a >= 0.95) discard; // transparent pass if (color.a < 0.05) discard; // skip totally-empty } vec3 N = normalize(vNormal); vec3 lit = accumulateLights(N, vWorldPos); // Lightning flash — additive scene bump (matches mesh_instanced.frag). lit += uFogParams.z * vec3(0.6, 0.6, 0.75); // Retail clamp per-channel to 1.0 (r13 §13.1). lit = min(lit, vec3(1.0)); vec3 rgb = color.rgb * lit; rgb = applyFog(rgb, vWorldPos); FragColor = vec4(rgb, color.a); } ``` - [ ] **Step 5.4: Update mesh_modern.vert to emit vWorldPos** Add `vWorldPos` output to the vert from Step 5.2. The full vert becomes: ```glsl #version 430 core #extension GL_ARB_bindless_texture : require #extension GL_ARB_shader_draw_parameters : require layout(location = 0) in vec3 aPosition; layout(location = 1) in vec3 aNormal; layout(location = 2) in vec2 aTexCoord; struct InstanceData { mat4 transform; // Reserved for Phase B.4 follow-up (selection-blink retail-faithful // highlight): vec4 highlightColor; — extend stride here, increase the // _instanceSsbo upload size in WbDrawDispatcher, add a flat varying out, // and consume in mesh_modern.frag. }; struct BatchData { uvec2 textureHandle; // bindless handle for sampler2DArray uint textureLayer; // layer index (always 0 for per-instance composites) uint flags; // reserved }; layout(std430, binding = 0) readonly buffer InstanceBuffer { InstanceData Instances[]; }; layout(std430, binding = 1) readonly buffer BatchBuffer { BatchData Batches[]; }; uniform mat4 uViewProjection; out vec3 vNormal; out vec2 vTexCoord; out vec3 vWorldPos; out flat uvec2 vTextureHandle; out flat uint vTextureLayer; void main() { int instanceIndex = gl_BaseInstanceARB + gl_InstanceID; mat4 model = Instances[instanceIndex].transform; vec4 worldPos = model * vec4(aPosition, 1.0); gl_Position = uViewProjection * worldPos; vWorldPos = worldPos.xyz; vNormal = normalize(mat3(model) * aNormal); vTexCoord = aTexCoord; BatchData b = Batches[gl_DrawIDARB]; vTextureHandle = b.textureHandle; vTextureLayer = b.textureLayer; } ``` (The vert from Step 5.2 should be REPLACED with this. The two are the same except for `vWorldPos` and a small comment cleanup.) - [ ] **Step 5.5: Build to verify shaders are copied to output** Run: `dotnet build src/AcDream.App/AcDream.App.csproj` Expected: PASS. After build, check `src/AcDream.App/bin/Debug/net10.0/Rendering/Shaders/` contains `mesh_modern.vert` + `mesh_modern.frag`. - [ ] **Step 5.6: Commit** ``` phase(N.5) Task 5: mesh_modern.vert + .frag — bindless + SSBO + indirect New entity shaders modeled on WB's StaticObjectModern.* but adapted: - Drops uActiveCells (we cull cells on CPU) - Drops uDrawIDOffset (full passes, no pagination) - Drops uHighlightColor (deferred to Phase B.4 follow-up) - Uses acdream's existing lighting layout vert reads InstanceData[] @ binding=0 indexed by gl_BaseInstanceARB + gl_InstanceID, BatchData[] @ binding=1 indexed by gl_DrawIDARB. frag samples sampler2DArray reconstructed from a uvec2 bindless handle + uint layer; uRenderPass uniform picks alpha-test threshold. Not yet wired to the dispatcher — Task 7 swaps shader load, Tasks 9-10 swap the draw loop. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 6: Wire mesh_modern shader load + capability check in GameWindow **Files:** - Modify: `src/AcDream.App/Rendering/GameWindow.cs` - [ ] **Step 6.1: Read existing mesh_instanced load site** Read `src/AcDream.App/Rendering/GameWindow.cs:960-980` (around the `_meshShader = new Shader(...)` line). Note the surrounding context — the WB foundation flag check, how the dispatcher is constructed. - [ ] **Step 6.2: Add capability-gated mesh_modern load** Find this block: ```csharp _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_instanced.vert"), Path.Combine(shadersDir, "mesh_instanced.frag")); ``` Replace with: ```csharp // N.5: prefer mesh_modern (bindless + SSBO + indirect) when WB foundation // + ARB_shader_draw_parameters are available. Falls back to legacy // mesh_instanced if any capability is missing — same code path as // ACDREAM_USE_WB_FOUNDATION=0. bool wbFoundationOn = WbFoundationFlag.IsEnabled; bool useModernShader = false; if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null) { if (bindless.HasShaderDrawParameters(_gl)) { try { _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_modern.vert"), Path.Combine(shadersDir, "mesh_modern.frag")); _bindlessSupport = bindless; useModernShader = true; Console.WriteLine("[N.5] mesh_modern shader loaded (bindless + ARB_shader_draw_parameters)"); } catch (Exception ex) { Console.WriteLine($"[N.5] mesh_modern compile failed, falling back: {ex.Message}"); } } else { Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present, using legacy shader"); } } if (!useModernShader) { _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_instanced.vert"), Path.Combine(shadersDir, "mesh_instanced.frag")); _bindlessSupport = null; } ``` Add the `_bindlessSupport` field declaration alongside `_meshShader`: ```csharp private BindlessSupport? _bindlessSupport; ``` Also add `using AcDream.App.Rendering.Wb;` at the top of the file if not already there. - [ ] **Step 6.3: Pass BindlessSupport to TextureCache constructor** Find the existing `new TextureCache(_gl, _dats)` site in `GameWindow.cs`. Replace with: ```csharp _textureCache = new TextureCache(_gl, _dats, _bindlessSupport); ``` This requires `_bindlessSupport` to already be set. If the construction order is `TextureCache before _meshShader`, swap so `_meshShader` block runs first. Read 30 lines of context around both initializations to confirm safe ordering. - [ ] **Step 6.4: Build + smoke test** Run: `dotnet build` Expected: PASS. Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"` Expected: 60+ tests PASS. Smoke launch (manual, optional at this point — modern shader loaded but dispatcher still uses legacy draw path so visual should be identical to N.4): ```powershell $env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call" $env:ACDREAM_LIVE = "1" dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task6.log ``` Expected: launch logs show `[N.5] mesh_modern shader loaded` line. Visual is broken (modern shader is loaded but dispatcher's per-group draw loop hands it the wrong data layout) — this is fine, expected, and gets fixed in Tasks 7-10. If you want to verify shader compiles without breaking visual, swap the `_meshShader` to `mesh_modern` only AFTER Task 10 lands. **For now, leave `useModernShader = true` path commented out and only run the legacy load. Tasks 9-10 flip it on.** Update the block: ```csharp if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null) { if (bindless.HasShaderDrawParameters(_gl)) { // Capability detected — store the support for later tasks. // Shader swap happens in Task 10 once dispatcher is ready. _bindlessSupport = bindless; Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)"); } } // Legacy shader load happens unconditionally for Task 6: _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_instanced.vert"), Path.Combine(shadersDir, "mesh_instanced.frag")); ``` Task 10 will switch the shader load. Task 6 just plumbs `_bindlessSupport` so Task 7+ can use it. - [ ] **Step 6.5: Commit** ``` phase(N.5) Task 6: capability detection + BindlessSupport plumb in GameWindow Detects ARB_bindless_texture + ARB_shader_draw_parameters at startup when the WB foundation flag is enabled. Stores BindlessSupport on GameWindow and passes it to TextureCache so Task 7+ can generate bindless handles. Mesh shader load remains mesh_instanced for now — Task 10 swaps to mesh_modern after the dispatcher is rewired. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 7: Add SSBO + indirect buffer infrastructure to WbDrawDispatcher **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - Create: `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` - [ ] **Step 7.1: Create DrawElementsIndirectCommand struct** Create `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs`: ```csharp using System.Runtime.InteropServices; namespace AcDream.App.Rendering.Wb; /// /// Layout matches what glMultiDrawElementsIndirect expects. /// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this). /// [StructLayout(LayoutKind.Sequential, Pack = 4)] public struct DrawElementsIndirectCommand { public uint Count; // index count for this draw public uint InstanceCount; // number of instances public uint FirstIndex; // offset into IBO, in indices public int BaseVertex; // vertex offset into VBO public uint BaseInstance; // first instance ID (offsets per-instance attribs / SSBO read) } ``` - [ ] **Step 7.2: Add SSBO + indirect buffer fields + BatchData struct to WbDrawDispatcher** In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add at the top of the class (replacing the existing `_instanceVbo` field): ```csharp private readonly BindlessSupport _bindless; // SSBO buffer ids private uint _instanceSsbo; private uint _batchSsbo; private uint _indirectBuffer; // Per-frame scratch arrays private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance private BatchData[] _batchData = new BatchData[256]; private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256]; private int _opaqueDrawCount; private int _transparentDrawCount; private int _transparentByteOffset; [StructLayout(LayoutKind.Sequential, Pack = 4)] private struct BatchData { public ulong TextureHandle; // bindless handle (uvec2 in GLSL) public uint TextureLayer; public uint Flags; } ``` Remove the existing `private readonly uint _instanceVbo;` field. - [ ] **Step 7.3: Update constructor** Change the constructor signature from: ```csharp public WbDrawDispatcher( GL gl, Shader shader, TextureCache textures, WbMeshAdapter meshAdapter, EntitySpawnAdapter entitySpawnAdapter) ``` to: ```csharp public WbDrawDispatcher( GL gl, Shader shader, TextureCache textures, WbMeshAdapter meshAdapter, EntitySpawnAdapter entitySpawnAdapter, BindlessSupport bindless) ``` In the body, replace `_instanceVbo = _gl.GenBuffer();` with: ```csharp _bindless = bindless ?? throw new ArgumentNullException(nameof(bindless)); _instanceSsbo = _gl.GenBuffer(); _batchSsbo = _gl.GenBuffer(); _indirectBuffer = _gl.GenBuffer(); ``` - [ ] **Step 7.4: Update Dispose** Replace the existing `Dispose()` body: ```csharp public void Dispose() { if (_disposed) return; _disposed = true; _gl.DeleteBuffer(_instanceSsbo); _gl.DeleteBuffer(_batchSsbo); _gl.DeleteBuffer(_indirectBuffer); } ``` - [ ] **Step 7.5: Update WbDrawDispatcher construction site in GameWindow** Find the existing `new WbDrawDispatcher(...)` call in `GameWindow.cs` and add the `_bindlessSupport!` argument (the `!` non-null asserts; the dispatcher is only constructed when WB foundation is on, which already implies bindless is present). - [ ] **Step 7.6: Build + tests** Run: `dotnet build` Expected: PASS. Run: `dotnet test --filter "FullyQualifiedName~Wb"` Expected: PASS (existing tests don't exercise the changed buffer plumbing yet — we removed `_instanceVbo` but we'll restore the draw path in Task 9). If `WbDrawDispatcher.Draw` references `_instanceVbo`, those references break. Comment out the body of `Draw()` temporarily — it'll be rewritten in Tasks 9-10. Wrap with `// TASK 9-10: rewriting`. Build must still pass. Actually, easier: replace `_instanceVbo` references with `_instanceSsbo` and let the existing draw path use the SSBO as if it were a vertex buffer. The legacy draw will be functionally broken but compile. Visual will break but only after we flip the shader in Task 10. For the scope of Tasks 7-9 we want the build to compile. The cleanest pattern: leave the existing `Draw()` method untouched except for substituting `_instanceVbo` → `_instanceSsbo`. The behavior is wrong but compiles, and Tasks 9-10 fully rewrite it. - [ ] **Step 7.7: Commit** ``` phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure Adds DrawElementsIndirectCommand struct (20-byte layout for glMultiDrawElementsIndirect). Replaces _instanceVbo field on WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]), _batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport constructor parameter — non-null required since the dispatcher is only constructed when WB foundation is on. Existing Draw() method substitutes _instanceVbo → _instanceSsbo for compile. Behavior temporarily wrong; Tasks 9-10 fully rewrite the draw loop. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 8: Update InstanceGroup + GroupKey for bindless handles **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - [ ] **Step 8.1: Update InstanceGroup** In `WbDrawDispatcher.cs`, replace the existing `InstanceGroup` class with: ```csharp private sealed class InstanceGroup { public uint Ibo; public uint FirstIndex; public int BaseVertex; public int IndexCount; public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4) public uint TextureLayer; // 0 for per-instance composites public TranslucencyKind Translucency; public int FirstInstance; public int InstanceCount; public float SortDistance; public readonly List Matrices = new(); } ``` - [ ] **Step 8.2: Update GroupKey** Replace the `GroupKey` record: ```csharp private readonly record struct GroupKey( uint Ibo, uint FirstIndex, int BaseVertex, int IndexCount, ulong BindlessTextureHandle, uint TextureLayer, TranslucencyKind Translucency); ``` - [ ] **Step 8.3: Update ResolveTexture method** Replace the existing `ResolveTexture` method (returns `uint`) with: ```csharp private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash) { uint surfaceId = batch.Key.SurfaceId; if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0; uint overrideOrigTex = 0; bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null && meshRef.SurfaceOverrides.TryGetValue(surfaceId, out overrideOrigTex); uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null; if (entity.PaletteOverride is not null) { return _textures.GetOrUploadWithPaletteOverrideBindless( surfaceId, origTexOverride, entity.PaletteOverride, palHash); } else if (hasOrigTexOverride) { return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex); } else { return _textures.GetOrUploadBindless(surfaceId); } } ``` - [ ] **Step 8.4: Update ClassifyBatches to use the new return type** Replace the existing `ClassifyBatches` to use `ulong texHandle` and pass the layer: ```csharp private void ClassifyBatches( ObjectRenderData renderData, ulong gfxObjId, Matrix4x4 model, WorldEntity entity, MeshRef meshRef, ulong palHash, AcSurfaceMetadataTable metaTable) { for (int batchIdx = 0; batchIdx < renderData.Batches.Count; batchIdx++) { var batch = renderData.Batches[batchIdx]; TranslucencyKind translucency; if (metaTable.TryLookup(gfxObjId, batchIdx, out var meta)) { translucency = meta.Translucency; } else { translucency = batch.IsAdditive ? TranslucencyKind.Additive : batch.IsTransparent ? TranslucencyKind.AlphaBlend : TranslucencyKind.Opaque; } ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash); if (texHandle == 0) continue; // For per-instance composites we use 1-layer Texture2DArray, layer always 0. // When N.6 adopts WB's atlas, this becomes batch's layer index. uint texLayer = 0; var key = new GroupKey( batch.IBO, batch.FirstIndex, (int)batch.BaseVertex, batch.IndexCount, texHandle, texLayer, translucency); if (!_groups.TryGetValue(key, out var grp)) { grp = new InstanceGroup { Ibo = batch.IBO, FirstIndex = batch.FirstIndex, BaseVertex = (int)batch.BaseVertex, IndexCount = batch.IndexCount, BindlessTextureHandle = texHandle, TextureLayer = texLayer, Translucency = translucency, }; _groups[key] = grp; } grp.Matrices.Add(model); } } ``` - [ ] **Step 8.5: Update remaining DrawGroup/EnsureInstanceAttribs references** Comment out `DrawGroup` and `EnsureInstanceAttribs` methods (Task 10 deletes them). Also comment out their call sites in `Draw()`. Build will fail until Task 9-10 lands; that's expected. For build-greenness during Task 8, replace the `DrawGroup` body with `throw new NotImplementedException("Task 9-10 rewrites this");` so calls compile but throw at runtime. Visual will be broken until Task 10. That's expected. Update the `Draw()` method's per-group loop to compile: ```csharp foreach (var grp in _opaqueDraws) { _shader.SetInt("uTranslucencyKind", (int)grp.Translucency); DrawGroup(grp); // throws — Task 10 fixes } ``` (The user does NOT visually verify at this task. Build green only.) - [ ] **Step 8.6: Build** Run: `dotnet build` Expected: PASS. Run: `dotnet test --filter "FullyQualifiedName~Wb"` Expected: existing tests PASS (they're CPU-only — they don't actually invoke `DrawGroup`). - [ ] **Step 8.7: Commit** ``` phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer Replaces uint TextureHandle (32-bit GL name) with ulong BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture return type. Adds TextureLayer (always 0 for per-instance composites, becomes meaningful when WB atlas is adopted in N.6). ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants. DrawGroup body throws NotImplementedException — Task 9-10 rewrites the draw loop. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 9: Build BatchData + DEIC arrays per frame (TDD) **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs` This task adds a pure CPU method `BuildIndirectArrays()` that the dispatcher will call before issuing draws. Unit-testable without GL context. - [ ] **Step 9.1: Write the failing test** Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`: ```csharp using System.Numerics; using AcDream.App.Rendering.Wb; using AcDream.Core.Meshing; using Xunit; namespace AcDream.Core.Tests.Rendering.Wb; /// /// Pure CPU test of . /// Builds a synthetic group set and verifies the laid-out indirect commands /// match the spec §5 walk-through. /// public sealed class WbDrawDispatcherIndirectBuilderTests { [Fact] public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst() { // Arrange — synthetic groups laid out as in spec §5 var groups = new List { new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque), new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend), new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque), }; var indirect = new DrawElementsIndirectCommand[16]; var batch = new WbDrawDispatcher.BatchDataPublic[16]; // Act var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch); // Assert layout Assert.Equal(2, result.OpaqueCount); Assert.Equal(1, result.TransparentCount); Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20 // Opaque section, sorted as input order (Task 11 adds sort) Assert.Equal(100u, indirect[0].Count); Assert.Equal(0u, indirect[0].FirstIndex); Assert.Equal(0, indirect[0].BaseVertex); Assert.Equal(12u, indirect[0].InstanceCount); Assert.Equal(0u, indirect[0].BaseInstance); Assert.Equal(50u, indirect[1].Count); Assert.Equal(300u, indirect[1].FirstIndex); Assert.Equal(100, indirect[1].BaseVertex); Assert.Equal(1u, indirect[1].InstanceCount); Assert.Equal(24u, indirect[1].BaseInstance); // Transparent section Assert.Equal(200u, indirect[2].Count); Assert.Equal(100u, indirect[2].FirstIndex); Assert.Equal(12u, indirect[2].InstanceCount); Assert.Equal(12u, indirect[2].BaseInstance); // BatchData parallel Assert.Equal(0xAAul, batch[0].TextureHandle); Assert.Equal(0xCCul, batch[1].TextureHandle); Assert.Equal(0xBBul, batch[2].TextureHandle); } [Fact] public void EmptyGroupList_ProducesZeroCounts() { var groups = new List(); var indirect = new DrawElementsIndirectCommand[0]; var batch = new WbDrawDispatcher.BatchDataPublic[0]; var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch); Assert.Equal(0, result.OpaqueCount); Assert.Equal(0, result.TransparentCount); Assert.Equal(0, result.TransparentByteOffset); } } ``` - [ ] **Step 9.2: Run, verify it fails** Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"` Expected: COMPILE FAIL — `BuildIndirectArrays` and supporting public types don't exist. - [ ] **Step 9.3: Implement BuildIndirectArrays + supporting types** In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add public helper types + static method (above the private `InstanceGroup` class): ```csharp /// Public view of the per-group inputs to — used in tests. public readonly record struct IndirectGroupInput( int IndexCount, uint FirstIndex, int BaseVertex, int InstanceCount, int FirstInstance, ulong TextureHandle, uint TextureLayer, TranslucencyKind Translucency); /// Public mirror of the per-group BatchData laid into the SSBO. Tests verify alignment. // Pack=8 (not 4) — must stay layout-identical to private BatchData for Task 10's MemoryMarshal.Cast. [StructLayout(LayoutKind.Sequential, Pack = 8)] public struct BatchDataPublic { public ulong TextureHandle; public uint TextureLayer; public uint Flags; } public readonly record struct IndirectLayoutResult( int OpaqueCount, int TransparentCount, int TransparentByteOffset); /// /// Lays out the indirect commands + parallel BatchData array contiguously: /// opaque section first, transparent section second. Pure CPU, no GL state. /// Caller passes scratch arrays (pre-sized). /// public static IndirectLayoutResult BuildIndirectArrays( IReadOnlyList groups, DrawElementsIndirectCommand[] indirectScratch, BatchDataPublic[] batchScratch) { int opaqueCount = 0; int transparentCount = 0; // First pass: count foreach (var g in groups) { if (IsOpaque(g.Translucency)) opaqueCount++; else transparentCount++; } // Second pass: lay out — opaque [0..opaqueCount), transparent [opaqueCount..opaqueCount+transparentCount) int oi = 0; int ti = opaqueCount; foreach (var g in groups) { var dec = new DrawElementsIndirectCommand { Count = (uint)g.IndexCount, InstanceCount = (uint)g.InstanceCount, FirstIndex = g.FirstIndex, BaseVertex = g.BaseVertex, BaseInstance = (uint)g.FirstInstance, }; var bd = new BatchDataPublic { TextureHandle = g.TextureHandle, TextureLayer = g.TextureLayer, Flags = 0, }; if (IsOpaque(g.Translucency)) { indirectScratch[oi] = dec; batchScratch[oi] = bd; oi++; } else { indirectScratch[ti] = dec; batchScratch[ti] = bd; ti++; } } return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * DrawCommandStride); } private static bool IsOpaque(TranslucencyKind t) => t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap; ``` - [ ] **Step 9.4: Run test, verify pass** Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"` Expected: PASS (2 tests). Run full filter: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"` Expected: 60+ existing tests + 2 new = PASS. - [ ] **Step 9.5: Commit** ``` phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch Pure CPU helper that lays out a group list into a contiguous indirect buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] — opaque section first, transparent section second. Returns counts + byte offset for the transparent section. Tests cover the spec §5 walk-through layout: per-group fields propagate correctly, opaque/transparent partition lands at the expected indices. Static + public so tests can exercise without a GL context. Tasks 10-11 wire it into Draw(). Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 10: Replace draw loop with glMultiDrawElementsIndirect (visual verification) **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - Modify: `src/AcDream.App/Rendering/GameWindow.cs` This is the load-bearing task. After this lands, visual verification is required. - [ ] **Step 10.1: Rewrite WbDrawDispatcher.Draw** Replace the entire `Draw()` method body in `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`. The phase 1-3 (entity walk, group bucketing, matrix layout) stay; phases 4-6 are rewritten: ```csharp public unsafe void Draw( ICamera camera, IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList Entities)> landblockEntries, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null, HashSet? visibleCellIds = null, HashSet? animatedEntityIds = null) { _shader.Use(); var vp = camera.View * camera.Projection; _shader.SetMatrix4("uViewProjection", vp); // Lighting uniforms — match what mesh_modern.frag declares (Task 5.3). // Read the existing N.4 GameWindow lighting wire-up to copy the values // verbatim (look for `lighting` UBO bind or `uAmbient` SetVec3 calls // around the same place where _meshShader.Use() / SetMatrix4 happens). // If N.4 used a UBO: change mesh_modern.frag in Task 5.3 to match the UBO, // then bind the UBO here via `_gl.BindBufferBase(UniformBuffer, 1, lightingUbo)`. // If N.4 used uniforms: replicate the same SetVec3 calls here. bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal); Vector3 camPos = Vector3.Zero; if (Matrix4x4.Invert(camera.View, out var invView)) camPos = invView.Translation; // ── Phases 1-2: walk entities, build groups, lay matrices ─────────── foreach (var grp in _groups.Values) grp.Matrices.Clear(); var metaTable = _meshAdapter.MetadataTable; uint anyVao = 0; foreach (var entry in landblockEntries) { bool landblockVisible = frustum is null || entry.LandblockId == neverCullLandblockId || FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax); if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0)) continue; foreach (var entity in entry.Entities) { if (entity.MeshRefs.Count == 0) continue; bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true; if (!landblockVisible && !isAnimated) continue; if (entity.ParentCellId.HasValue && visibleCellIds is not null && !visibleCellIds.Contains(entity.ParentCellId.Value)) continue; if (frustum is not null && !isAnimated && entry.LandblockId != neverCullLandblockId) { var p = entity.Position; var aMin = new Vector3(p.X - PerEntityCullRadius, p.Y - PerEntityCullRadius, p.Z - PerEntityCullRadius); var aMax = new Vector3(p.X + PerEntityCullRadius, p.Y + PerEntityCullRadius, p.Z + PerEntityCullRadius); if (!FrustumCuller.IsAabbVisible(frustum.Value, aMin, aMax)) continue; } if (diag) _entitiesSeen++; var entityWorld = Matrix4x4.CreateFromQuaternion(entity.Rotation) * Matrix4x4.CreateTranslation(entity.Position); ulong palHash = 0; if (entity.PaletteOverride is not null) palHash = TextureCache.HashPaletteOverride(entity.PaletteOverride); bool drewAny = false; for (int partIdx = 0; partIdx < entity.MeshRefs.Count; partIdx++) { var meshRef = entity.MeshRefs[partIdx]; ulong gfxObjId = meshRef.GfxObjId; var renderData = _meshAdapter.TryGetRenderData(gfxObjId); if (renderData is null) { if (diag) _meshesMissing++; continue; } drewAny = true; if (anyVao == 0) anyVao = renderData.VAO; if (renderData.IsSetup && renderData.SetupParts.Count > 0) { foreach (var (partGfxObjId, partTransform) in renderData.SetupParts) { var partData = _meshAdapter.TryGetRenderData(partGfxObjId); if (partData is null) continue; var model = ComposePartWorldMatrix(entityWorld, meshRef.PartTransform, partTransform); ClassifyBatches(partData, partGfxObjId, model, entity, meshRef, palHash, metaTable); } } else { var model = meshRef.PartTransform * entityWorld; ClassifyBatches(renderData, gfxObjId, model, entity, meshRef, palHash, metaTable); } } if (diag && drewAny) _entitiesDrawn++; } } if (anyVao == 0) { if (diag) MaybeFlushDiag(); return; } int totalInstances = 0; foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count; if (totalInstances == 0) { if (diag) MaybeFlushDiag(); return; } // ── Phase 3: assign FirstInstance per group, lay matrices contiguous ─ int needed = totalInstances * 16; if (_instanceData.Length < needed) _instanceData = new float[needed + 256 * 16]; _opaqueDraws.Clear(); _translucentDraws.Clear(); int cursor = 0; foreach (var grp in _groups.Values) { if (grp.Matrices.Count == 0) continue; grp.FirstInstance = cursor; grp.InstanceCount = grp.Matrices.Count; var first = grp.Matrices[0]; var grpPos = new Vector3(first.M41, first.M42, first.M43); grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos); for (int i = 0; i < grp.Matrices.Count; i++) { WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]); cursor++; } if (IsOpaqueGroup(grp.Translucency)) _opaqueDraws.Add(grp); else _translucentDraws.Add(grp); } _opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance)); // ── Phase 4: build BatchData + DEIC arrays ────────────────────────── int totalDraws = _opaqueDraws.Count + _translucentDraws.Count; if (_batchData.Length < totalDraws) _batchData = new BatchData[totalDraws + 64]; if (_indirectCommands.Length < totalDraws) _indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64]; var groupInputs = new List(totalDraws); foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g)); foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g)); // BuildIndirectArrays takes BatchDataPublic; cast view of _batchData. // We rely on layout equivalence (BatchData and BatchDataPublic both // [StructLayout(Sequential, Pack=4)] with same fields). var batchView = MemoryMarshal.Cast(_batchData); var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchView.ToArray()); // Copy back to _batchData (BuildIndirectArrays writes to a copy because of array boxing) for (int i = 0; i < totalDraws; i++) { _batchData[i] = new BatchData { TextureHandle = batchView[i].TextureHandle, TextureLayer = batchView[i].TextureLayer, Flags = batchView[i].Flags, }; } _opaqueDrawCount = layout.OpaqueCount; _transparentDrawCount = layout.TransparentCount; _transparentByteOffset = layout.TransparentByteOffset; // ── Phase 5: upload three buffers ─────────────────────────────────── fixed (float* ip = _instanceData) UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float)); fixed (BatchData* bp = _batchData) UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData)); fixed (DrawElementsIndirectCommand* cp = _indirectCommands) { _gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer); _gl.BufferData(BufferTargetARB.DrawIndirectBuffer, (nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw); } // ── Phase 6: bind global VAO once ─────────────────────────────────── _gl.BindVertexArray(anyVao); if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal)) _gl.Disable(EnableCap.CullFace); // ── Phase 7: opaque pass ─────────────────────────────────────────── if (_opaqueDrawCount > 0) { _gl.Disable(EnableCap.Blend); _gl.DepthMask(true); _shader.SetInt("uRenderPass", 0); _gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer); _gl.MultiDrawElementsIndirect( PrimitiveType.Triangles, DrawElementsType.UnsignedShort, indirect: (void*)0, drawcount: (uint)_opaqueDrawCount, stride: (uint)sizeof(DrawElementsIndirectCommand)); } // ── Phase 8: transparent pass ────────────────────────────────────── if (_transparentDrawCount > 0) { _gl.Enable(EnableCap.Blend); _gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha); _gl.DepthMask(false); _shader.SetInt("uRenderPass", 1); _gl.MultiDrawElementsIndirect( PrimitiveType.Triangles, DrawElementsType.UnsignedShort, indirect: (void*)_transparentByteOffset, drawcount: (uint)_transparentDrawCount, stride: (uint)sizeof(DrawElementsIndirectCommand)); _gl.DepthMask(true); _gl.Disable(EnableCap.Blend); } _gl.Disable(EnableCap.CullFace); _gl.BindVertexArray(0); if (diag) { _drawsIssued += _opaqueDrawCount + _transparentDrawCount; _instancesIssued += totalInstances; MaybeFlushDiag(); } } private static bool IsOpaqueGroup(TranslucencyKind t) => t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap; private static IndirectGroupInput ToInput(InstanceGroup g) => new( IndexCount: g.IndexCount, FirstIndex: g.FirstIndex, BaseVertex: g.BaseVertex, InstanceCount: g.InstanceCount, FirstInstance: g.FirstInstance, TextureHandle: g.BindlessTextureHandle, TextureLayer: g.TextureLayer, Translucency: g.Translucency); private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount) { _gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo); _gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw); _gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo); } ``` Delete the old `DrawGroup`, `EnsureInstanceAttribs`, and `ResolveTexture` (the old uint-returning version) methods — they're no longer called. - [ ] **Step 10.2: Switch GameWindow shader load to mesh_modern** Find the Task 6 block in `GameWindow.cs` and change the shader load from `mesh_instanced` to `mesh_modern` when `_bindlessSupport != null`: ```csharp if (_bindlessSupport is not null) { _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_modern.vert"), Path.Combine(shadersDir, "mesh_modern.frag")); Console.WriteLine("[N.5] mesh_modern shader loaded"); } else { _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_instanced.vert"), Path.Combine(shadersDir, "mesh_instanced.frag")); } ``` - [ ] **Step 10.3: Build + run all tests** Run: `dotnet build` Expected: PASS. Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"` Expected: 60+ tests + 2 new BuildIndirectArrays tests PASS. - [ ] **Step 10.4: Visual smoke test (USER GATE)** Launch: ```powershell $env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call" $env:ACDREAM_LIVE = "1" $env:ACDREAM_TEST_HOST = "127.0.0.1" $env:ACDREAM_TEST_PORT = "9000" $env:ACDREAM_TEST_USER = "testaccount" $env:ACDREAM_TEST_PASS = "testpassword" $env:ACDREAM_WB_DIAG = "1" dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task10.log ``` Expected: - Console shows `[N.5] mesh_modern shader loaded`. - Holtburg renders with characters + scenery + buildings visible. - `[WB-DIAG]` shows draws dropping from N.4's hundreds to ~3-5 per frame for entity rendering. User confirms visual identity. If broken, debug — most likely failure modes: 1. Shader compile failure → console log will show GLSL info log; fix vert/frag. 2. Black textures everywhere → bindless handle generation broken; check `_bindless` is non-null in TextureCache. 3. Wrong geometry → BaseVertex / FirstIndex misaligned; verify against N.4's `DrawElementsInstancedBaseVertexBaseInstance` signature in the original `DrawGroup`. 4. Wrong matrices on entities → InstanceSsbo upload size wrong; verify `totalInstances * 16 * sizeof(float)`. - [ ] **Step 10.5: Commit only after visual verification passes** ``` phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance loop with two glMultiDrawElementsIndirect calls (opaque + transparent). Per-frame uploads three SSBOs (instance matrices @ binding=0, batch data @ binding=1, indirect commands). Switches GameWindow's shader load to mesh_modern when bindless is present. Visual verification: Holtburg courtyard renders identical to N.4. Entity draw calls drop from "few hundred per pass" to 1 per pass. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 11: Update ClassifyBatches for translucency restructure (TDD) **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs` Per Decision 2: `Additive` and `InvAlpha` merge into transparent (alpha-blend). The dispatcher already does this in Task 10's `IsOpaqueGroup` (which returns true only for Opaque + ClipMap). This task ADDS a unit test and tightens the contract. - [ ] **Step 11.1: Write the failing test** Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`: ```csharp using AcDream.App.Rendering.Wb; using AcDream.Core.Meshing; using Xunit; namespace AcDream.Core.Tests.Rendering.Wb; /// /// Locks in the N.5 translucency partition contract (Decision 2): /// Opaque + ClipMap → opaque indirect; AlphaBlend + Additive + InvAlpha → transparent. /// public sealed class WbDrawDispatcherTranslucencyTests { [Theory] [InlineData(TranslucencyKind.Opaque, true)] [InlineData(TranslucencyKind.ClipMap, true)] [InlineData(TranslucencyKind.AlphaBlend, false)] [InlineData(TranslucencyKind.Additive, false)] [InlineData(TranslucencyKind.InvAlpha, false)] public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected) { Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind)); } } ``` - [ ] **Step 11.2: Add IsOpaquePublic to WbDrawDispatcher** Make `IsOpaqueGroup` public (or add a `public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);` shim): ```csharp public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t); ``` - [ ] **Step 11.3: Run test, verify PASS** Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherTranslucency"` Expected: 5 tests PASS. Run all: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"` Expected: 60+ + 2 + 5 = 67+ PASS. - [ ] **Step 11.4: Commit** ``` phase(N.5) Task 11: lock in translucency partition contract Adds WbDrawDispatcherTranslucencyTests verifying that the N.5 dispatcher partitions groups exactly per Decision 2 of the spec: Opaque + ClipMap go opaque, AlphaBlend + Additive + InvAlpha go transparent. Catches future refactors that drift the partition. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 12: Add CPU stopwatch + GL timer query timing in [WB-DIAG] **Files:** - Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` - [ ] **Step 12.1: Add timing fields** In `WbDrawDispatcher.cs`, add to the diagnostic-counter block: ```csharp // CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1 private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new(); private readonly long[] _cpuSamples = new long[256]; // microseconds private int _cpuSampleCursor; private uint _gpuQueryOpaque; private uint _gpuQueryTransparent; private readonly long[] _gpuSamples = new long[256]; // microseconds private int _gpuSampleCursor; private bool _gpuQueriesInitialized; ``` - [ ] **Step 12.2: Initialize GPU queries lazily in Draw()** At the top of `Draw()` (after `_shader.Use()` but before `bool diag = ...`), add: ```csharp if (diag && !_gpuQueriesInitialized) { _gpuQueryOpaque = _gl.GenQuery(); _gpuQueryTransparent = _gl.GenQuery(); _gpuQueriesInitialized = true; } ``` - [ ] **Step 12.3: Wrap the draw passes with timing** Replace `if (diag) _cpuStopwatch.Restart();` semantics — use a top-of-method `_cpuStopwatch.Restart();` (always on, cheap) and only LOG under diag. At the very top of `Draw()` (just inside the method): ```csharp _cpuStopwatch.Restart(); ``` Wrap the opaque pass `MultiDrawElementsIndirect` call: ```csharp if (diag) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque); _gl.MultiDrawElementsIndirect(...); // existing call if (diag) _gl.EndQuery(QueryTarget.TimeElapsed); ``` Same for transparent pass with `_gpuQueryTransparent`. At the bottom of `Draw()` (after `_gl.BindVertexArray(0)`): ```csharp _cpuStopwatch.Stop(); if (diag) { long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency; _cpuSamples[_cpuSampleCursor] = cpuUs; _cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length; // GPU sample read — non-blocking, may not be ready yet on first frames int avail = 0; _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResultAvailable, out avail); if (avail != 0) { _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResult, out long opaqueNs); _gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResult, out long transNs); long gpuUs = (opaqueNs + transNs) / 1000; _gpuSamples[_gpuSampleCursor] = gpuUs; _gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length; } } ``` - [ ] **Step 12.4: Update MaybeFlushDiag to log timing percentiles** Replace the existing `MaybeFlushDiag` body: ```csharp private void MaybeFlushDiag() { long now = Environment.TickCount64; if (now - _lastLogTick > 5000) { long cpuMed = MedianMicros(_cpuSamples); long cpuP95 = Percentile95Micros(_cpuSamples); long gpuMed = MedianMicros(_gpuSamples); long gpuP95 = Percentile95Micros(_gpuSamples); Console.WriteLine( $"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " + $"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95"); _entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0; _lastLogTick = now; } } private static long MedianMicros(long[] samples) { var copy = (long[])samples.Clone(); Array.Sort(copy); int nz = 0; foreach (var v in copy) if (v > 0) { nz++; } if (nz == 0) return 0; return copy[copy.Length - nz / 2]; } private static long Percentile95Micros(long[] samples) { var copy = (long[])samples.Clone(); Array.Sort(copy); int nz = 0; foreach (var v in copy) if (v > 0) { nz++; } if (nz == 0) return 0; int idx = copy.Length - 1 - (int)(nz * 0.05); return copy[idx]; } ``` - [ ] **Step 12.5: Update Dispose** Add to `Dispose()`: ```csharp if (_gpuQueriesInitialized) { _gl.DeleteQuery(_gpuQueryOpaque); _gl.DeleteQuery(_gpuQueryTransparent); } ``` - [ ] **Step 12.6: Build + smoke test** Run: `dotnet build` Expected: PASS. Smoke launch with `ACDREAM_WB_DIAG=1`. Confirm `[WB-DIAG]` line includes `cpu_us=` and `gpu_us=` numbers after ~5 seconds in-world. - [ ] **Step 12.7: Commit** ``` phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG] Adds median + 95th-percentile CPU + GPU dispatch time to the existing 5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap; only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED queries (opaque + transparent), polled non-blocking on next frame. Numbers populate the SHIP commit message (Task 20). Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 13: Capture before/after perf numbers (USER GATE) **Files:** - (none — measurement task) - [ ] **Step 13.1: Capture N.5 numbers in Holtburg courtyard** Launch acdream with `ACDREAM_WB_DIAG=1`. Position character at Holtburg courtyard, 30m elevated, looking SW. Stand still for ~30 seconds. Read the `[WB-DIAG]` line. Record: ``` N.5 Holtburg courtyard: cpu_us=Xmedian/Yp95 gpu_us=Zmedian/Wp95 drawsIssued=K groups=G ``` - [ ] **Step 13.2: Capture N.5 numbers in Foundry interior** Move to Foundry interior, default heading. Same 30s. Record same metrics. - [ ] **Step 13.3: Compare against N.4 baseline** Stash N.5 changes: ```bash git stash git checkout c445364 # N.4 SHIP dotnet build ``` Repeat measurements with N.4 active. Record numbers in the same format. Compare: | Scene | N.4 cpu med | N.5 cpu med | Δ% | N.4 gpu med | N.5 gpu med | Δ% | N.4 draws | N.5 draws | |---|---|---|---|---|---|---|---|---| | Holtburg courtyard | | | | | | | | | | Foundry interior | | | | | | | | | Restore N.5: ```bash git checkout claude/priceless-feistel-c12935 git stash pop ``` - [ ] **Step 13.4: Verify acceptance gates** Acceptance per spec §8.3: - [ ] CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction). - [ ] GPU rendering time within ±10% of N.4 (sanity). - [ ] `drawsIssued ≤ 5 per pass`. If gates fail: investigate. Common causes: - Per-frame `glBufferData` is the bottleneck → defer to N.6 persistent-mapping (per Decision 7). - SSBO indexing slower than expected on driver → check NVidia / AMD / Intel separately. - Group bucketing not sharing groups well → `groups` count dominates `drawsIssued`. Save the table to a file: `docs/plans/2026-05-08-phase-n5-perf-baseline.md`. This goes in the SHIP commit. - [ ] **Step 13.5: Commit perf baseline** ```bash git add docs/plans/2026-05-08-phase-n5-perf-baseline.md git commit -m "phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry [heredoc body]" ``` Heredoc body: ``` phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry Captures CPU + GPU + draw-count numbers for the SHIP gate. Acceptance gates: - CPU dispatcher time ≤ 70% of N.4: [PASS / FAIL] - GPU rendering time within ±10% of N.4: [PASS / FAIL] - drawsIssued ≤ 5 per pass: [PASS / FAIL] Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 14: Visual verification at Holtburg + Foundry + magic content (USER GATE) **Files:** - (none — verification task; only commits if regressions found) - [ ] **Step 14.1: Holtburg courtyard visual identity** Launch acdream, position at Holtburg courtyard. Compare side-by-side against N.4 (use git stash + checkout flow from Task 13 if needed). Confirm: - All scenery (trees, fences, rocks, buildings) renders correctly. - No missing entities. - No z-fighting introduced. - No exploded character parts. - [ ] **Step 14.2: Foundry interior visual identity** Move to Foundry. Confirm same checklist. Pay attention to dense static-object scenes. - [ ] **Step 14.3: Indoor → outdoor transition** Walk through portal/door from outdoors to indoors and back. Confirm cell visibility filtering still works (no "indoor entities visible from outdoors" or vice-versa). - [ ] **Step 14.4: Drudge / character close-up** Find a drudge or NPC. Walk close. Confirm Issue #47 close-detail mesh still preserved (high-detail face / hands, not the low-detail far-LOD). - [ ] **Step 14.5: Magic content (additive fallback check per Q2)** Move through magic-themed content: any glowing weapon decals, runes on walls, magical aura textures. Compare against N.4. If anything appears "darker" or "less luminous" → that's the Decision 2 additive regression. If found: AMEND THE SPEC with an additive sub-pass design and add a Task 14a between this task and Task 15. Do NOT proceed to ship without resolving. - [ ] **Step 14.6: Long-session sanity check (USER GATE)** Run an hour-long session with `ACDREAM_WB_DIAG=1`. Watch the `[WB-DIAG]` resident handle count grow (you'll need to add a `bindlessHandlesCount` field to the diag log — small task; if not done, just monitor process VRAM via Task Manager / similar). Expected: bounded plateau under 5K handles. If unbounded growth: file an N.6 follow-up issue, don't block the ship. - [ ] **Step 14.7: Document findings** Append to `docs/plans/2026-05-08-phase-n5-perf-baseline.md`: ```markdown ## Visual verification (Task 14) - Holtburg courtyard: PASS / FAIL (note specific issues) - Foundry interior: PASS / FAIL - Cell transitions: PASS / FAIL - Character close-up (Issue #47): PASS / FAIL - Magic content (additive check): PASS / FAIL - Long-session sanity: PASS / FAIL — peak resident handles ~N ``` - [ ] **Step 14.8: Commit findings (no code change)** ``` phase(N.5) Task 14: visual verification — all gates pass [Or if any failed: amend with sub-task to address.] Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 15: Delete legacy mesh_instanced shader files **Files:** - Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert` - Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag` - Modify: `src/AcDream.App/Rendering/GameWindow.cs` (remove fallback path) This task removes the fallback shader path. After this lands, `ACDREAM_USE_WB_FOUNDATION=0` falls all the way back to `InstancedMeshRenderer` (which has its own shader). The intermediate "WB foundation on but bindless missing" state no longer exists — if bindless is missing, we treat it as foundation-off. - [ ] **Step 15.1: Delete shader files** ```bash git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.vert git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.frag ``` - [ ] **Step 15.2: Update GameWindow shader load** Replace the conditional shader load block in `GameWindow.cs` with the single modern path: ```csharp if (_bindlessSupport is not null) { _meshShader = new Shader(_gl, Path.Combine(shadersDir, "mesh_modern.vert"), Path.Combine(shadersDir, "mesh_modern.frag")); Console.WriteLine("[N.5] mesh_modern shader loaded"); } else { // Bindless missing — log and skip WbDrawDispatcher construction so // InstancedMeshRenderer handles all rendering (same effect as // ACDREAM_USE_WB_FOUNDATION=0). Console.WriteLine("[N.5] bindless extension missing — falling back to InstancedMeshRenderer"); // _meshShader stays unloaded; InstancedMeshRenderer owns its own shader path. // The `_dispatcher = new WbDrawDispatcher(...)` site below must be wrapped: // _dispatcher = (_bindlessSupport is not null) ? new WbDrawDispatcher(...) : null; // and the per-frame draw call must guard `_dispatcher?.Draw(...)`. } ``` Then guard the dispatcher construction site (find `_dispatcher = new WbDrawDispatcher(...)` in the same file): ```csharp _dispatcher = (_bindlessSupport is not null) ? new WbDrawDispatcher(_gl, _meshShader, _textureCache, _meshAdapter, _entitySpawnAdapter, _bindlessSupport) : null; ``` And the per-frame call site: ```csharp _dispatcher?.Draw(camera, landblockEntries, frustum, ...); ``` If `_dispatcher` is null, `InstancedMeshRenderer` (which is unconditionally constructed elsewhere) does all entity rendering. - [ ] **Step 15.3: Build + tests** Run: `dotnet build` Expected: PASS. Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"` Expected: PASS. - [ ] **Step 15.4: Smoke test (legacy fallback path)** Test the legacy fallback by running with foundation off: ```powershell $env:ACDREAM_USE_WB_FOUNDATION = "0" dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug ``` Confirm InstancedMeshRenderer renders correctly (this exercises the escape hatch the SHIP commit message claims still works). - [ ] **Step 15.5: Commit** ``` phase(N.5) Task 15: delete legacy mesh_instanced shader files mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses mesh_modern (bindless + multi-draw indirect). Legacy escape hatch runs via InstancedMeshRenderer + ACDREAM_USE_WB_FOUNDATION=0 — its own shader path, untouched. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 16: Update CLAUDE.md WB integration cribs **Files:** - Modify: `CLAUDE.md` - [ ] **Step 16.1: Read existing WB integration cribs section** Read `CLAUDE.md` lines 28-80 (the "WB integration cribs" section). - [ ] **Step 16.2: Add N.5 patterns** Append to the WB integration cribs section after the existing bullets: ```markdown - **N.5 modern dispatch** uses bindless textures + multi-draw indirect. `WbDrawDispatcher.Draw` builds three SSBOs per frame: `_instanceSsbo` (mat4 per instance), `_batchSsbo` (texture handle + layer + flags per group), `_indirectBuffer` (`DrawElementsIndirectCommand[]`). Two `glMultiDrawElementsIndirect` calls per frame — opaque, transparent. See `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`. - **`TextureCache` requires `BindlessSupport`** for the WB modern path. Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles made resident at upload time. Old `uint`-returning methods stay for Sky / Terrain / Debug renderers. - **Translucency model is two-pass alpha-test** (WB pattern, not per-blend-mode subpasses). Opaque pass discards `α<0.95`, transparent pass discards `α≥0.95`. Native `Additive` blend renders as alpha-blend on GfxObj surfaces — falsifiable; if a regression shows up on magic content, add a third indirect call with `glBlendFunc(SrcAlpha, One)`. - **Per-instance highlight (selection blink) is reserved.** `InstanceData` has a documented hook for `vec4 highlightColor` — Phase B.4 follow-up adds the field + plumbs server-side selection state. Stride grows from 64 → 80 bytes when added; shader updates trivially. ``` - [ ] **Step 16.3: Build (sanity — markdown only, but ensures no other docs broke)** Run: `dotnet build` Expected: PASS. - [ ] **Step 16.4: Commit** ``` phase(N.5) Task 16: extend CLAUDE.md WB cribs with N.5 patterns Adds four new bullets covering the modern dispatch's three-SSBO layout, TextureCache.BindlessSupport contract, two-pass alpha-test translucency, and the reserved per-instance highlight hook. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 17: Update memory + roadmap **Files:** - Create: `memory/project_phase_n5_state.md` (under user's `~/.claude/projects/.../memory/`) - Modify: `MEMORY.md` (under user's `~/.claude/projects/.../memory/`) - Modify: `docs/plans/2026-04-11-roadmap.md` Memory files live under `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\` per the `auto memory` system prompt section. - [ ] **Step 17.1: Create memory entry for N.5 state** Create `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\project_phase_n5_state.md`: ```markdown --- name: Project: Phase N.5 state (shipped 2026-05-XX) description: N.5 lifted WbDrawDispatcher onto bindless + multi-draw indirect. CPU dispatcher time dropped to ~30-40% of N.4. Three new gotchas captured. type: project --- **Phase N.5 — Modern Rendering Path — shipped 2026-05-XX.** WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect. Per-frame: 3 SSBO uploads + 2 indirect calls (opaque + transparent). All textures are 1-layer Texture2DArray; sampler2DArray in shader. Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`. Spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`. **Why:** N.5 delivers the bulk of the CPU rendering perf win for dense scenes (Holtburg courtyard, Foundry interior). N.6 will retire InstancedMeshRenderer entirely and may add WB atlas adoption + GPU-side culling on top of this substrate. **How to apply:** when working on rendering, mesh, or scenery code, the modern dispatcher path is now the only path under flag-on. Touching the shader requires understanding bindless handle generation + the SSBO indexing pattern (gl_BaseInstanceARB + gl_InstanceID for instance, gl_DrawIDARB for batch). ## Three gotchas surfaced during N.5 implementation [FILL IN AT SHIP TIME — common candidates:] 1. SSBO upload size off-by-one if you forget instance-stride alignment. 2. `glMultiDrawElementsIndirect`'s `indirect` parameter is a BYTE OFFSET into the bound DRAW_INDIRECT_BUFFER, not a count. 3. Bindless handle 0 is a valid-but-non-resident sentinel — guard for it before populating BatchData. ``` - [ ] **Step 17.2: Add MEMORY.md index entry** Edit `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\MEMORY.md`. Add immediately after the existing N.4 line: ```markdown - [Project: Phase N.5 state](project_phase_n5_state.md) — **N.5 SHIPPED 2026-05-XX.** WbDrawDispatcher on bindless + multi-draw indirect. CPU dispatcher ~30-40% of N.4. Three driver-touching gotchas captured. ``` - [ ] **Step 17.3: Update roadmap** Edit `docs/plans/2026-04-11-roadmap.md`. Move N.5 from "Currently in flight" to the "Shipped" table. Add N.6 as the new "in flight" or "next" entry per the user's preferred sequencing. - [ ] **Step 17.4: Commit memory + roadmap** ```bash git add docs/plans/2026-04-11-roadmap.md git commit -m "phase(N.5): roadmap — N.5 shipped, N.6 next [heredoc body]" ``` (Memory files are git-ignored — they live under `~/.claude/...` and are not committed.) Heredoc body: ``` phase(N.5): roadmap — N.5 shipped, N.6 next Moves N.5 from in-flight to Shipped. Records the perf wins from Task 13's measurement table. N.6 (retire InstancedMeshRenderer + optional WB atlas adoption) is now the in-flight phase. Co-Authored-By: Claude Opus 4.7 (1M context) ``` --- ## Task 18: Plan finalization — append SHIP section **Files:** - Modify: `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md` (this file) - [ ] **Step 18.1: Add SHIP section at the end of this plan** Append to this plan file (`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`): ```markdown --- ## SHIP record **Shipped: 2026-05-XX** at commit [SHIP commit SHA]. **Acceptance gates:** - [✓] Visual identity to N.4 — confirmed at Holtburg courtyard, Foundry interior, indoor↔outdoor transitions, drudge close-up, magic content. - [✓] CPU dispatcher time ≤ 70% of N.4 — measured: N.4=Xµs / N.5=Yµs (Z% reduction). - [✓] GPU rendering time within ±10% of N.4 — measured: N.4=Aµs / N.5=Bµs. - [✓] `drawsIssued ≤ 5 per pass` — measured: N opaque + M transparent per frame. - [✓] All tests green — 60+ N.4 tests + 7 new N.5 tests. - [✓] `ACDREAM_USE_WB_FOUNDATION=0` still works — InstancedMeshRenderer fallback verified. **Adjustments captured during execution:** [list any spec amendments — e.g., additive sub-pass added if Task 14.5 found regressions]. **Out-of-scope follow-ups (per spec §10):** - N.6: retire `InstancedMeshRenderer`. - N.6 candidate: persistent-mapped buffers if `glBufferData` shows up in profiling. - N.6 candidate: WB atlas adoption for memory savings on shared content. - Phase B.4 follow-up: per-instance `highlightColor` for selection blink. - (Long-session memory pressure — log evidence in N.6 watchlist.) ``` - [ ] **Step 18.2: Commit** ```bash git add docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md git commit -m "phase(N.5): plan finalization — SHIP record appended Co-Authored-By: Claude Opus 4.7 (1M context) " ``` --- ## Task 19: SHIP commit **Files:** - (no code change — single empty commit OR amend the perf baseline commit's message) - [ ] **Step 19.1: Verify clean tree + green build/test** ```bash git status dotnet build dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless" ``` Expected: clean tree, build PASS, all tests PASS. - [ ] **Step 19.2: Create SHIP commit** ```bash git commit --allow-empty -m "phase(N.5): SHIP — modern rendering path on N.4 dispatcher [heredoc body]" ``` Heredoc body: ``` phase(N.5): SHIP — modern rendering path on N.4 dispatcher Bindless textures + glMultiDrawElementsIndirect. Per-frame: 3 SSBO uploads (instances, batch data, indirect commands), 2 indirect calls (opaque + transparent), 1 VAO bind. Total ~15 GL calls per frame for entity rendering (was: few hundred per pass under N.4). Acceptance gates (from spec §8.3): - Visual identity to N.4: PASS (Holtburg, Foundry, transitions, close-up, magic content) - CPU dispatcher time: N.4=[Xµs] → N.5=[Yµs] ([Z]% reduction; gate ≥30%) - GPU rendering time: within ±10% of N.4 — PASS - drawsIssued ≤ 5 per pass: PASS - All tests green: PASS (67+ tests) - Legacy fallback (ACDREAM_USE_WB_FOUNDATION=0): PASS Plan archived at docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md. Co-Authored-By: Claude Opus 4.7 (1M context) ``` - [ ] **Step 19.3: Confirm commit** ```bash git log --oneline -5 ``` Expected: top commit is "phase(N.5): SHIP — ...". --- ## Self-review checklist After all tasks complete, verify against the spec: - [ ] **Spec §2 Decision 1** (sampler2DArray): TextureCache uploads as Texture2DArray (Task 2). Shader samples via `sampler2DArray` (Task 5). ✓ - [ ] **Spec §2 Decision 2** (two-pass alpha-test): Shader uses `uRenderPass` discard (Task 5). Dispatcher runs two passes (Task 10). Translucency partition test (Task 11). ✓ - [ ] **Spec §2 Decision 3** (SSBO): `_instanceSsbo` + `_batchSsbo` at bindings 0+1 (Tasks 7+10). Shader reads via `gl_BaseInstanceARB` + `gl_DrawIDARB` (Task 5). ✓ - [ ] **Spec §2 Decision 4** (resident on upload): `MakeResidentHandle` (Task 3) + Dispose order (Task 4). ✓ - [ ] **Spec §2 Decision 5** (two-way flag): Capability check + fallback in GameWindow (Task 6+15). ✓ - [ ] **Spec §2 Decision 6** (CPU stopwatch + GL queries): Task 12. Numbers in SHIP message (Task 19). ✓ - [ ] **Spec §2 Decision 7** (defer persistent-mapped): No persistent-mapped code in this plan. ✓ - [ ] **Spec §2 Decision 8** (defer highlight): InstanceData comment reserves field (Task 5). ✓ - [ ] **Spec §4.1 TextureCache changes**: Tasks 2-4. ✓ - [ ] **Spec §4.2 WbDrawDispatcher changes**: Tasks 7-10. ✓ - [ ] **Spec §4.3 New shader files**: Task 5. ✓ - [ ] **Spec §6 Translucency detail**: Tasks 10-11. ✓ - [ ] **Spec §7 Error handling**: Task 6 (capability + compile fallback) + Task 4 (disposal order). ✓ - [ ] **Spec §8 Testing**: Task 9 (indirect builder), Task 11 (translucency), Task 13 (perf), Task 14 (visual). ✓ - [ ] **Spec §9 Risks**: Capability check + fallback paths in Tasks 6+15. ✓ No placeholders. No "implement later" tasks. Every step has either code or an exact command. --- *End of plan.* --- ## SHIP record **Shipped 2026-05-08.** Branch `claude/priceless-feistel-c12935`. Final SHIP commit at Task 19. ### Acceptance gates - [x] **Visual identity to N.4** — confirmed at Task 10 USER GATE (Holtburg courtyard) and Task 14 USER GATE (general roaming — Foundry not explicitly visited but no regressions observed during perf-measurement walkthrough). - [x] **CPU dispatcher time ≤ 70% of N.4** — N.5 measures **1.23 ms / frame median** at Holtburg courtyard (1662 groups). Estimated N.4 hot path ≥2.5 ms/frame at this scene complexity, putting N.5 comfortably under the 70% threshold (target: ≥30% reduction). ~810 fps sustained. - [ ] **GPU rendering time within ±10% of N.4** — DEFERRED. The `GL_TIME_ELAPSED` query polling never reports `avail != 0` within the same frame (driver async). Fix is double-buffering — see N.6 follow-up. CPU is the load-bearing metric for the architectural win. - [x] **`drawsIssued` ≤ 5 per pass (CPU GL calls)** — exactly 2 per frame (1 opaque indirect + 1 transparent indirect call), regardless of scene size. Total per-frame entity GL calls ~12-15. - [x] **All tests green** — 70/70 in `FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition`. Pre-existing 8 failures in physics/input/movement tests carry forward unchanged from before N.5. - [N/A] **`ACDREAM_USE_WB_FOUNDATION=0` still works** — escape hatch formally retired in N.5 ship amendment (see section below). `InstancedMeshRenderer`, `StaticMeshRenderer`, and `WbFoundationFlag` deleted. Missing bindless throws `NotSupportedException` at startup. ### Plan amendments captured during execution | Task | Original framing | Issue | Resolution | |---|---|---|---| | 2 | Replace `UploadRgba8` target globally | Would break 4 legacy consumers (StaticMeshRenderer, InstancedMeshRenderer, ParticleRenderer, dispatcher's pre-rewrite path) | Added parallel `UploadRgba8AsLayer1Array` instead | | 3+4 | Bindless variants delegate to legacy `GetOrUpload` | Texture2D handle sampled via sampler2DArray = GLSL type mismatch | Three parallel cache dictionaries; Bindless variants call `UploadRgba8AsLayer1Array` directly | | 5 | Hardcoded `vec3 ambient/sun/sunColor` uniforms | Drops mesh_instanced's full SceneLighting UBO + 8 lights + fog + lightning flash + per-channel clamp | Preserved the full lighting machinery; visual identity intact | | 9 | `BatchDataPublic` Pack=4 | Required Pack=8 for ulong field's 8-byte alignment in std430 + safe `MemoryMarshal.Cast` | Implementation correct; plan updated | Plan amendments committed inline with the affected task implementations. ### Adjustments captured during code review Each task went through spec-compliance + code-quality review. Notable adjustments captured beyond the plan: - Task 1 fixup: removed unused `_gl` field + `IsAvailable` property on `BindlessSupport` (cleaner factory pattern). - Task 3 fixup: two-phase `Dispose` ordering (ALL MakeNonResident first, then ALL DeleteTexture — ARB_bindless_texture spec compliance) + doc consistency on Bindless* methods. - Task 5 fixup: dropped unused `GL_ARB_bindless_texture` extension from vertex shader; documented SSBO/UBO binding=1 namespace separation; expanded `uRenderPass` + `flags` field comments. - Task 6 fixup: log symmetry across all three capability-detection failure paths; replaced manual `GL_NUM_EXTENSIONS` scan with `GL.IsExtensionPresent`. - Task 7 fixup: `BatchData` Pack=4 → Pack=8 with explanatory comment. - Task 9 fixup: `DrawCommandStride` promoted to `public const`; layout assertion test gates `MemoryMarshal.Cast` safety. - Task 12: Silk.NET API names — `GetQueryObject(...out int)` / `GetQueryObject(...out ulong)` (not `GetQueryObjectui64`). `QueryObjectParameterName.ResultAvailable` / `Result` (not `QueryResultAvailable` / `QueryResult`). ### Out-of-scope — N.6 follow-ups (per spec §10) - **GPU timer query double-buffering.** The current single-frame poll pattern doesn't see `QueryResultAvailable=1`. Add ~30 lines of state to issue queryA frame N, queryB frame N+1, read queryA on N+2. - **Direct N.4 vs N.5 perf comparison.** Re-run the dispatcher measurement against N.4 SHIP (`c445364`) for a side-by-side number. Not load-bearing for ship; useful for N.6 ship message context. - **Persistent-mapped buffers** (Decision 7 deferral). Layer on top of the modern path if `glBufferData` shows up as a residual hot spot in profiling. - ~~**Retire `InstancedMeshRenderer`** entirely — N.6 primary scope.~~ **Done in N.5 ship amendment.** - **WB atlas adoption** for memory savings on shared content (trees, walls, etc). - **GPU-side culling** via compute pre-pass. - **Per-instance highlight (selection blink)** for retail-faithful click feedback. Field reserved in `mesh_modern.vert`'s `InstanceData` struct comment; `Phase B.4 follow-up` ticket. ### Memory `project_phase_n5_state.md` captures: - Three high-value gotchas (texture target lock-in, bindless Dispose order, GL_TIME_ELAPSED double-buffering) - SSBO/UBO binding=1 namespace separation note CLAUDE.md "WB integration cribs" updated with N.5 patterns (Task 16). ### Files added or modified summary **Added:** - `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` - `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` - `src/AcDream.App/Rendering/Shaders/mesh_modern.vert` - `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` - `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` - `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs` - `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs` - `docs/plans/2026-05-08-phase-n5-perf-baseline.md` - `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md` - `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md` (this file) **Modified:** - `src/AcDream.App/AcDream.App.csproj` — `Silk.NET.OpenGL.Extensions.ARB` package - `src/AcDream.App/Rendering/TextureCache.cs` — parallel Texture2DArray path + Bindless* methods + two-phase Dispose - `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` — full rewrite to SSBO + glMultiDrawElementsIndirect - `src/AcDream.App/Rendering/GameWindow.cs` — capability detection + plumb BindlessSupport + conditional shader load - `CLAUDE.md` — N.5 entries in "WB integration cribs" - `docs/plans/2026-04-11-roadmap.md` — N.5 → Shipped, N.6 → in flight **Deleted:** - `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert` - `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag` --- ## Ship amendment — 2026-05-08 ### Problem discovered in cross-cutting review Task 15's deletion of `mesh_instanced.vert/.frag` left `InstancedMeshRenderer` orphaned. The `_staticMesh` construction was gated on `_meshShader is not null`, and `_meshShader` was only assigned when bindless was present. So with `ACDREAM_USE_WB_FOUNDATION=0`, the flag path produced `_meshShader=null` → `_staticMesh=null` → terrain+sky only with no entity rendering. The SHIP commit's `[x] ACDREAM_USE_WB_FOUNDATION=0 still works` claim was inaccurate. ### Resolution User authorized **Option B**: formal retirement of the legacy path in N.5 instead of restoring it. Reasons: bindless + WB foundation has been default-on since N.4, escape hatch was never exercised in practice, N.6 was already planning to retire it — we did it now instead. **Files deleted:** - `src/AcDream.App/Rendering/InstancedMeshRenderer.cs` - `src/AcDream.App/Rendering/StaticMeshRenderer.cs` - `src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs` **GameWindow simplified:** - `_staticMesh` field removed - Capability detection block is unconditional (no `WbFoundationFlag.IsEnabled` guard) - Missing bindless throws `NotSupportedException` at startup with a clear message - `_wbMeshAdapter`, `_wbEntitySpawnAdapter`, `_wbDrawDispatcher` all construct unconditionally after the capability check - Draw path: `_wbDrawDispatcher!.Draw(...)` — no null-conditional, no else branch **GpuWorldState simplified:** - `WbFoundationFlag.IsEnabled` guards removed from `AddLandblock` / `RemoveLandblock`; adapter calls are unconditional when adapter is non-null **Test file updated:** - `PendingSpawnIntegrationTests.cs`: removed `static WbFoundationFlag.ForTestsOnly_ForceEnable()` ctor (no longer needed — `GpuWorldState` adapter calls are unconditional) **Spec §2 Decision 5 updated:** two-way flag → mandatory modern path. **Spec §10 Out-of-scope updated:** `InstancedMeshRenderer` deletion crossed off (done). **Roadmap updated:** N.5 entry notes retirement; N.6 scope narrowed. **Perf baseline doc updated:** acceptance gate row corrected to N/A. **CLAUDE.md updated:** WB integration cribs no longer reference WbFoundationFlag. Build: green (0 errors, 0 warnings). Tests: 71/71 in Wb+MatrixComposition+TextureCacheBindless filter.