acdream/docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Erik aba2cfc3b6 docs(N.5): plan amendment — Task 2 uses parallel upload path, not replace
Implementer caught that the original Task 2 (replace UploadRgba8 target
with Texture2DArray) would break four legacy consumers whose shaders
sample via sampler2D: WbDrawDispatcher (pre-rewrite path),
StaticMeshRenderer, InstancedMeshRenderer (legacy escape hatch),
ParticleRenderer.

Revised: Task 2 ADDS a parallel UploadRgba8AsLayer1Array. Existing
UploadRgba8 (Texture2D) stays for legacy callers. Task 3's Bindless*
methods will call the new array path with their own cache dictionaries.
Same surface may be uploaded twice during transition; bounded cost.
N.6 cleanup deletes the legacy path.

Task 3 will be amended at dispatch time to reflect parallel caches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 19:42:18 +02:00


# Phase N.5 — Modern Rendering Path — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Lift `WbDrawDispatcher` onto bindless textures + multi-draw indirect, reducing per-pass GL calls from ~hundreds to ~5, with visual identity to N.4.
**Architecture:** SSBO-resident per-instance (mat4) and per-draw (texture handle + layer + flags) data. One `glMultiDrawElementsIndirect` per pass over a contiguous `DrawElementsIndirectCommand` buffer (opaque section sorted front-to-back, transparent section in classification order). 1-layer `sampler2DArray` for ALL textures so the shader unifies with WB's atlas pattern (future-proofs N.6+ atlas adoption). WB's two-pass alpha-test for translucency.
**Tech Stack:** .NET 10, C#, Silk.NET.OpenGL 2.23, Silk.NET.OpenGL.Extensions.ARB, GLSL 4.30 + `GL_ARB_bindless_texture` + `GL_ARB_shader_draw_parameters`. xUnit for tests.
**Predecessor:** N.4 ship at `c445364` + spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
---
## File map
**Create:**
- `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` — thin wrapper around `Silk.NET.OpenGL.Extensions.ARB.ArbBindlessTexture`, capability detection.
- `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` — DEIC struct for indirect dispatch.
- `src/AcDream.App/Rendering/Shaders/mesh_modern.vert` — bindless + SSBO + indirect vertex shader.
- `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` — alpha-test discard fragment shader.
- `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`
- `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`
- `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`
**Modify:**
- `src/AcDream.App/AcDream.App.csproj` — add `Silk.NET.OpenGL.Extensions.ARB` package.
- `src/AcDream.App/Rendering/TextureCache.cs` — Texture2DArray uploads, three Bindless `GetOrUpload*` methods, Dispose order.
- `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` — replace draw loop with SSBO + indirect dispatch, add timing diagnostics.
- `src/AcDream.App/Rendering/GameWindow.cs` — load `mesh_modern` shaders + capability check + fallback.
- `CLAUDE.md` — extend "WB integration cribs" with N.5 patterns.
- `docs/plans/2026-04-11-roadmap.md` — move N.5 to "shipped" at end.
**Delete (Task 15):**
- `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert`
- `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`
---
## Workflow per task
1. Read the spec section the task implements.
2. For TDD-friendly tasks: write the failing test → run → verify failure → implement → run → verify pass → commit.
3. For shader / pure-integration tasks (no unit-testable behavior): build green → visual smoke test → commit.
4. After every commit, run `dotnet build` (full) + `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"`. Both must be green.
Commit message convention (matching N.4):
- Tasks 1-14: `phase(N.5) Task N: <description>`
- Tasks 15-19: `phase(N.5): <description>`
- Task 20: `phase(N.5): SHIP — <perf numbers + summary>`
Always co-author: `Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>`
---
## Task 1: Add ArbBindlessTexture package + BindlessSupport wrapper
**Files:**
- Modify: `src/AcDream.App/AcDream.App.csproj`
- Create: `src/AcDream.App/Rendering/Wb/BindlessSupport.cs`
(The test file `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` is created in Task 3, NOT this task.)
- [ ] **Step 1.1: Add package reference**
In `src/AcDream.App/AcDream.App.csproj`, add inside the existing `<ItemGroup>` containing `Silk.NET.OpenGL`:
```xml
<PackageReference Include="Silk.NET.OpenGL.Extensions.ARB" Version="2.23.0" />
```
- [ ] **Step 1.2: Build to verify package resolves**
Run: `dotnet build src/AcDream.App/AcDream.App.csproj`
Expected: PASS, package restored.
- [ ] **Step 1.3: Write the BindlessSupport class**
Create `src/AcDream.App/Rendering/Wb/BindlessSupport.cs`:
```csharp
using Silk.NET.OpenGL;
using Silk.NET.OpenGL.Extensions.ARB;
namespace AcDream.App.Rendering.Wb;
/// <summary>
/// Thin wrapper around <see cref="ArbBindlessTexture"/> + capability detection
/// for the modern rendering path. Constructed once at startup via
/// <see cref="TryCreate"/>, which returns false when the extension isn't
/// available; the constructor itself does not probe for the extension.
/// </summary>
public sealed class BindlessSupport
{
private readonly GL _gl;
private readonly ArbBindlessTexture _ext;
public bool IsAvailable => true; // Construction succeeded
public BindlessSupport(GL gl, ArbBindlessTexture extension)
{
_gl = gl;
_ext = extension;
}
public static bool TryCreate(GL gl, out BindlessSupport? support)
{
if (gl.TryGetExtension<ArbBindlessTexture>(out var ext))
{
support = new BindlessSupport(gl, ext);
return true;
}
support = null;
return false;
}
/// <summary>Get a 64-bit bindless handle for the texture and make it resident.
/// Idempotent: handle is the same for a given texture name.</summary>
public ulong GetResidentHandle(uint textureName)
{
ulong h = _ext.GetTextureHandle(textureName);
if (!_ext.IsTextureHandleResident(h))
_ext.MakeTextureHandleResident(h);
return h;
}
/// <summary>Release residency for a handle. Call before deleting the underlying texture.</summary>
public void MakeNonResident(ulong handle)
{
if (_ext.IsTextureHandleResident(handle))
_ext.MakeTextureHandleNonResident(handle);
}
/// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
/// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
/// from this extension.</summary>
public bool HasShaderDrawParameters(GL gl)
{
int n = 0;
gl.GetInteger(GLEnum.NumExtensions, out n);
for (int i = 0; i < n; i++)
{
string ext = gl.GetStringS(StringName.Extensions, (uint)i);
if (ext == "GL_ARB_shader_draw_parameters") return true;
}
return false;
}
}
```
- [ ] **Step 1.4: Build to verify**
Run: `dotnet build`
Expected: PASS.
- [ ] **Step 1.5: Commit**
```bash
git add src/AcDream.App/AcDream.App.csproj src/AcDream.App/Rendering/Wb/BindlessSupport.cs
git commit -m "phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection
[heredoc body]"
```
Use this exact heredoc body:
```
phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection
Adds Silk.NET.OpenGL.Extensions.ARB 2.23.0 package and a thin
BindlessSupport wrapper exposing GetResidentHandle / MakeNonResident /
HasShaderDrawParameters. TryCreate returns false if the bindless
extension isn't present, letting WbFoundationFlag fall back to legacy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 2: Add parallel Texture2DArray upload path to TextureCache
**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`
**AMENDED 2026-05-08** after first-pass implementation surfaced a flaw. Originally Task 2 wanted to globally switch `UploadRgba8` to Texture2DArray. Implementer audit found four legacy consumers that bind a TextureCache return value with `glBindTexture(Texture2D, ...)`: `WbDrawDispatcher.cs:363` (rewritten in Task 10 — but breaks meanwhile), `StaticMeshRenderer.cs:126,223`, `InstancedMeshRenderer.cs:282,361` (legacy escape hatch — must keep working under foundation flag-off), and `ParticleRenderer.cs:162`. A texture has ONE GL target — can't be both Texture2D and Texture2DArray. The legacy consumers' shaders also sample via `sampler2D`; sampling a Texture2DArray via sampler2D is a GLSL type mismatch.
**Revised approach:** ADD a parallel `UploadRgba8AsLayer1Array` method. Don't touch the existing `UploadRgba8`. Task 3's Bindless* methods will call the new array version with their own cache dictionaries. Legacy callers stay on the Texture2D path, untouched. WB modern dispatcher (Task 10) uses the array path.
Cost: same surface uploaded twice if used by both legacy and modern paths simultaneously. In practice the overlap is small, and N.6 deletes the legacy path entirely. Acceptable transition cost.
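The parallel-cache arrangement described above can be sketched without a GL context. This is only an illustration under stated assumptions, not the actual `TextureCache` code: `ParallelTextureCaches`, its two dictionaries, and the injected upload delegates are hypothetical names standing in for `UploadRgba8` and `UploadRgba8AsLayer1Array`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: two independent caches keyed by surface id, one per GL target.
// The delegates stand in for UploadRgba8 (Texture2D) and UploadRgba8AsLayer1Array
// (Texture2DArray); nothing here touches GL.
public sealed class ParallelTextureCaches
{
    private readonly Dictionary<uint, uint> _texture2DBySurfaceId = new();
    private readonly Dictionary<uint, uint> _arrayBySurfaceId = new();
    private readonly Func<uint, uint> _uploadAs2D;
    private readonly Func<uint, uint> _uploadAsArray;

    public ParallelTextureCaches(Func<uint, uint> uploadAs2D, Func<uint, uint> uploadAsArray)
    {
        _uploadAs2D = uploadAs2D;
        _uploadAsArray = uploadAsArray;
    }

    // Legacy path: callers bind the returned name as a Texture2D.
    public uint GetOrUpload2D(uint surfaceId) =>
        GetOrAdd(_texture2DBySurfaceId, surfaceId, _uploadAs2D);

    // Modern path: the returned name is a 1-layer Texture2DArray.
    public uint GetOrUploadArray(uint surfaceId) =>
        GetOrAdd(_arrayBySurfaceId, surfaceId, _uploadAsArray);

    private static uint GetOrAdd(Dictionary<uint, uint> cache, uint key, Func<uint, uint> upload)
    {
        if (cache.TryGetValue(key, out uint name)) return name;
        name = upload(key);   // at most one upload per surface per cache
        cache[key] = name;
        return name;
    }
}
```

With this shape, a surface requested by both paths is uploaded at most once per cache, which is the bounded doubling the amendment accepts.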
- [ ] **Step 2.1: Read existing UploadRgba8 in TextureCache.cs**
Read `src/AcDream.App/Rendering/TextureCache.cs:256-280`. Confirm it uses `TextureTarget.Texture2D` + `TexImage2D`.
- [ ] **Step 2.2: ADD UploadRgba8AsLayer1Array method (do NOT replace UploadRgba8)**
ADD this NEW method to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `UploadRgba8` (which stays untouched):
```csharp
/// <summary>
/// Variant of <see cref="UploadRgba8"/> that uploads pixel data as a 1-layer
/// Texture2DArray. Required by the WB modern rendering path which samples via
/// sampler2DArray in its bindless shader. Pixel data is identical.
/// </summary>
private unsafe uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
{
uint tex = _gl.GenTexture();
_gl.BindTexture(TextureTarget.Texture2DArray, tex);
fixed (byte* p = decoded.Rgba8)
_gl.TexImage3D(
TextureTarget.Texture2DArray,
0,
InternalFormat.Rgba8,
(uint)decoded.Width,
(uint)decoded.Height,
depth: 1,
border: 0,
PixelFormat.Rgba,
PixelType.UnsignedByte,
p);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
_gl.BindTexture(TextureTarget.Texture2DArray, 0);
return tex;
}
```
- [ ] **Step 2.3: Build + run tests**
Run: `dotnet build`
Expected: PASS. The new method is unused at this point, but that's fine — Task 3 wires the bindless variants to call it. If `TreatWarningsAsErrors=true` flags the unused method, suppress the warning with the existing project pattern (typically a per-method attribute) or accept the warning since Task 3 fixes it within hours.
Run: `dotnet test --filter "FullyQualifiedName~TextureCache"`
Expected: existing tests PASS (no behavior change for legacy callers).
- [ ] **Step 2.4: Commit**
```
phase(N.5) Task 2: parallel Texture2DArray upload path in TextureCache
Adds UploadRgba8AsLayer1Array — uploads pixel data as a 1-layer
Texture2DArray. Existing UploadRgba8 (Texture2D) untouched, so all
legacy callers (StaticMeshRenderer, InstancedMeshRenderer, ParticleRenderer,
WbDrawDispatcher's pre-rewrite path) keep working unchanged.
Required for Task 3's Bindless* methods which need the Texture2DArray
target so the WB modern shader can sample via sampler2DArray. Same
surface may be uploaded both ways during the N.5/N.6 transition;
doubling is bounded and acceptable. After N.6 retires legacy
renderers entirely, the legacy UploadRgba8 becomes unused and is
deleted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 3: Add bindless handle cache + Bindless GetOrUpload methods
**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`
- [ ] **Step 3.1: Read TextureCache constructor + cache fields**
Read `src/AcDream.App/Rendering/TextureCache.cs:1-50`. Note the existing dictionaries: `_handlesBySurfaceId`, `_handlesByOverridden`, `_handlesByPalette`.
- [ ] **Step 3.2: Add BindlessSupport dependency to TextureCache constructor**
In `src/AcDream.App/Rendering/TextureCache.cs`, change the constructor from:
```csharp
public TextureCache(GL gl, DatCollection dats)
{
_gl = gl;
_dats = dats;
}
```
to:
```csharp
private readonly Wb.BindlessSupport? _bindless;
private readonly Dictionary<uint, ulong> _bindlessHandlesByGlName = new();
public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
{
_gl = gl;
_dats = dats;
_bindless = bindless;
}
```
The optional parameter keeps backward compatibility with consumers that don't need bindless (sky, terrain, etc.).
- [ ] **Step 3.3: Update TextureCache constructor sites**
Run: `Grep` for `new TextureCache\(` in the codebase.
Identified call site: `src/AcDream.App/Rendering/GameWindow.cs` (typically around the WB foundation init).
Modify `GameWindow.cs` to pass the `BindlessSupport` instance — but only after Task 6 wires it up. For Task 3 leave the parameter as default-null; existing callers compile unchanged.
- [ ] **Step 3.4: Add MakeResidentHandle helper + three Bindless GetOrUpload methods**
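This listing predates the Task 2 amendment: it delegates to the existing Texture2D-returning `GetOrUpload*` methods, whereas the amended plan has the Bindless* methods go through `UploadRgba8AsLayer1Array` with their own cache dictionaries. Task 3 will be amended at dispatch time to reflect that.
Add to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `GetOrUploadWithPaletteOverride` overloads: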
```csharp
/// <summary>
/// 64-bit bindless handle variant of <see cref="GetOrUpload"/>.
/// Throws if BindlessSupport wasn't provided to the constructor.
/// </summary>
public ulong GetOrUploadBindless(uint surfaceId)
{
uint name = GetOrUpload(surfaceId);
return MakeResidentHandle(name);
}
/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithOrigTextureOverride"/>.</summary>
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId)
{
uint name = GetOrUploadWithOrigTextureOverride(surfaceId, overrideOrigTextureId);
return MakeResidentHandle(name);
}
/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithPaletteOverride"/>
/// taking a precomputed palette hash.</summary>
public ulong GetOrUploadWithPaletteOverrideBindless(
uint surfaceId,
uint? overrideOrigTextureId,
PaletteOverride paletteOverride,
ulong precomputedPaletteHash)
{
uint name = GetOrUploadWithPaletteOverride(surfaceId, overrideOrigTextureId, paletteOverride, precomputedPaletteHash);
return MakeResidentHandle(name);
}
private ulong MakeResidentHandle(uint glTextureName)
{
if (glTextureName == 0) return 0;
if (_bindless is null)
throw new InvalidOperationException(
"TextureCache constructed without BindlessSupport — cannot generate bindless handles. " +
"WbDrawDispatcher requires the bindless ctor overload.");
if (_bindlessHandlesByGlName.TryGetValue(glTextureName, out var h))
return h;
h = _bindless.GetResidentHandle(glTextureName);
_bindlessHandlesByGlName[glTextureName] = h;
return h;
}
```
- [ ] **Step 3.5: Write the contract marker test**
Create `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`:
```csharp
using AcDream.App.Rendering;
using AcDream.App.Rendering.Wb;
using DatReaderWriter;
using Xunit;
namespace AcDream.Core.Tests.Rendering;
/// <summary>
/// Lightweight unit tests that exercise <see cref="TextureCache"/>'s bindless
/// methods through their dependency on <see cref="BindlessSupport"/>.
/// These tests run without a GL context — they verify guard behavior. Real
/// bindless integration is covered by visual verification (Task 17).
/// </summary>
public sealed class TextureCacheBindlessTests
{
[Fact]
public void GetOrUploadBindless_ThrowsWithoutBindlessSupport()
{
// We can't easily construct a real TextureCache in a headless test.
// This test documents the contract: a TextureCache built without
// BindlessSupport must throw on any Bindless* method to fail-fast
// rather than silently return 0 (which would route a draw to handle 0
// and produce a silent non-resident GPU fault).
// Marker test — the actual throw lives in TextureCache.MakeResidentHandle
// and is reached only via GL-bound Bindless* methods. This test passes
// by virtue of the throw existing in source. See Task 3 Step 3.4 for
// the contract definition.
Assert.True(true, "Contract documented in TextureCache.MakeResidentHandle.");
}
}
```
(The "real" bindless test surface is the visual gate at Task 17 — there's no headless GL context for unit-testing handle generation. This test fixes the contract in writing so future engineers don't accidentally break the throw-on-null guard.)
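If a headless check of the guard is wanted later, one possible approach is to pull the guard + cache logic out of `MakeResidentHandle` and inject the GL call as a delegate. The sketch below is an assumption, not part of this plan: `BindlessHandleCache` and its constructor parameter are hypothetical names, and the delegate stands in for `BindlessSupport.GetResidentHandle`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch: same guard order as MakeResidentHandle (zero name,
// missing bindless support, cache hit, cache miss), with no GL dependency.
public sealed class BindlessHandleCache
{
    private readonly Func<uint, ulong>? _getResidentHandle;
    private readonly Dictionary<uint, ulong> _handlesByGlName = new();

    // Pass null to model a TextureCache constructed without BindlessSupport.
    public BindlessHandleCache(Func<uint, ulong>? getResidentHandle)
    {
        _getResidentHandle = getResidentHandle;
    }

    public ulong MakeResidentHandle(uint glTextureName)
    {
        if (glTextureName == 0) return 0;
        if (_getResidentHandle is null)
            throw new InvalidOperationException(
                "Constructed without bindless support; cannot generate handles.");
        if (_handlesByGlName.TryGetValue(glTextureName, out ulong h))
            return h;                          // cache hit: no extension call
        h = _getResidentHandle(glTextureName); // cache miss: one extension call
        _handlesByGlName[glTextureName] = h;
        return h;
    }
}
```

A unit test could then assert that a null delegate throws and that repeated lookups for the same GL name invoke the delegate only once.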
- [ ] **Step 3.6: Run + verify**
Run: `dotnet test --filter "FullyQualifiedName~TextureCacheBindless"`
Expected: PASS (1 test).
Run full build: `dotnet build`
Expected: PASS.
- [ ] **Step 3.7: Commit**
```
phase(N.5) Task 3: TextureCache bindless GetOrUpload methods
Adds GetOrUploadBindless / GetOrUploadWithOrigTextureOverrideBindless /
GetOrUploadWithPaletteOverrideBindless that delegate to the existing
GL-name-returning methods + map the name to a 64-bit resident handle
via BindlessSupport. Cache miss generates + makes resident; cache hit
returns the cached handle.
Constructor gains an optional BindlessSupport parameter — null keeps
backward compat for callers (sky, terrain, debug) that don't need
bindless. Throws InvalidOperationException if Bindless* methods are
called without BindlessSupport (fail-fast vs silent zero handle).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 4: Update TextureCache.Dispose for bindless release order
**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`
- [ ] **Step 4.1: Replace Dispose method**
Replace the existing `Dispose` in `src/AcDream.App/Rendering/TextureCache.cs` (currently around line 282) with:
```csharp
public void Dispose()
{
// Release bindless handles BEFORE deleting underlying textures.
// glDeleteTextures of a texture with resident handles is undefined behavior.
if (_bindless is not null)
{
foreach (var h in _bindlessHandlesByGlName.Values)
_bindless.MakeNonResident(h);
}
_bindlessHandlesByGlName.Clear();
foreach (var h in _handlesBySurfaceId.Values)
_gl.DeleteTexture(h);
_handlesBySurfaceId.Clear();
foreach (var h in _handlesByOverridden.Values)
_gl.DeleteTexture(h);
_handlesByOverridden.Clear();
foreach (var h in _handlesByPalette.Values)
_gl.DeleteTexture(h);
_handlesByPalette.Clear();
if (_magentaHandle != 0)
{
_gl.DeleteTexture(_magentaHandle);
_magentaHandle = 0;
}
}
```
- [ ] **Step 4.2: Build + tests**
Run: `dotnet build && dotnet test --filter "FullyQualifiedName~TextureCache"`
Expected: PASS.
- [ ] **Step 4.3: Commit**
```
phase(N.5) Task 4: TextureCache.Dispose releases bindless handles first
Iterating _bindlessHandlesByGlName + MakeNonResident before any
glDeleteTexture call, per ARB_bindless_texture spec — deleting a
texture with a resident handle is undefined behavior. Order: bindless
release → texture delete → magenta cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 5: Create mesh_modern.vert + mesh_modern.frag
**Files:**
- Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.vert`
- Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.frag`
Both files must be added to the `<Content>` / `<CopyToOutputDirectory>` block in `AcDream.App.csproj` if shaders aren't auto-included. Check the csproj for the existing pattern; `mesh_instanced.vert/.frag` should already be listed there.
- [ ] **Step 5.1: Read csproj content includes**
Read `src/AcDream.App/AcDream.App.csproj`. Find the `<Content>` block(s) that include `*.vert` / `*.frag` files. Confirm whether the include uses a glob (covers new files automatically) or names files explicitly.
If glob: nothing to do. If explicit: add `mesh_modern.vert` + `mesh_modern.frag` entries.
- [ ] **Step 5.2: Write mesh_modern.vert**
Create `src/AcDream.App/Rendering/Shaders/mesh_modern.vert`:
```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require
#extension GL_ARB_shader_draw_parameters : require
layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;
struct InstanceData {
mat4 transform;
// Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight):
// vec4 highlightColor;
// When implementing, extend stride here, increase _instanceSsbo upload
// size in WbDrawDispatcher, add a flat varying out, and consume in frag.
};
struct BatchData {
uvec2 textureHandle; // bindless handle for sampler2DArray
uint textureLayer; // layer index (always 0 for per-instance composites)
uint flags; // reserved
};
layout(std430, binding = 0) readonly buffer InstanceBuffer {
InstanceData Instances[];
};
layout(std430, binding = 1) readonly buffer BatchBuffer {
BatchData Batches[];
};
uniform mat4 uViewProjection;
out vec3 vNormal;
out vec2 vTexCoord;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;
void main() {
int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
mat4 model = Instances[instanceIndex].transform;
vec4 worldPos = model * vec4(aPosition, 1.0);
gl_Position = uViewProjection * worldPos;
vNormal = normalize(mat3(model) * aNormal);
vTexCoord = aTexCoord;
BatchData b = Batches[gl_DrawIDARB];
vTextureHandle = b.textureHandle;
vTextureLayer = b.textureLayer;
}
```
- [ ] **Step 5.3: Write mesh_modern.frag**
Create `src/AcDream.App/Rendering/Shaders/mesh_modern.frag`:
```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require
in vec3 vNormal;
in vec2 vTexCoord;
in flat uvec2 vTextureHandle;
in flat uint vTextureLayer;
uniform int uRenderPass; // 0 = opaque (discard alpha<0.95), 1 = transparent (discard alpha>=0.95)
uniform vec3 uAmbient;
uniform vec3 uSunDir;
uniform vec3 uSunColor;
out vec4 FragColor;
void main() {
sampler2DArray tex = sampler2DArray(vTextureHandle);
vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));
if (uRenderPass == 0) {
// Opaque pass: discard soft pixels — they belong to the transparent pass.
if (color.a < 0.95) discard;
} else {
// Transparent pass: discard hard pixels (already drawn opaque).
if (color.a >= 0.95) discard;
if (color.a < 0.05) discard; // skip totally-empty fragments
}
vec3 N = normalize(vNormal);
vec3 L = normalize(uSunDir);
float diff = max(dot(N, L), 0.0);
vec3 lit = uAmbient + uSunColor * diff;
color.rgb *= clamp(lit, 0.0, 1.0);
FragColor = color;
}
```
Note: this initial version uses `uniform vec3` for the lighting params instead of a UBO. This matches the existing `mesh_instanced.frag` pattern (verify by reading it). If `mesh_instanced.frag` actually uses a UBO, change to match.
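The two-pass alpha rules in the fragment shader above can be mirrored on the CPU to reason about which pass keeps a given texel. This is a small sketch for illustration only; `AlphaPass` and `AlphaTest.Survives` are hypothetical names, and the thresholds are copied from `mesh_modern.frag`.

```csharp
// Hypothetical CPU-side mirror of the discard rules in mesh_modern.frag.
public enum AlphaPass { Opaque = 0, Transparent = 1 }

public static class AlphaTest
{
    // Returns true when a fragment with the given alpha is NOT discarded in the pass.
    public static bool Survives(AlphaPass pass, float alpha)
    {
        if (pass == AlphaPass.Opaque)
            return alpha >= 0.95f;               // opaque pass discards alpha < 0.95
        return alpha < 0.95f && alpha >= 0.05f;  // transparent pass keeps [0.05, 0.95)
    }
}
```

Under these rules every alpha value is kept by at most one pass, and values below 0.05 are dropped by both.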
- [ ] **Step 5.4: Read existing mesh_instanced.frag to verify lighting layout**
Read `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`. Compare its lighting uniform shape to the version above. Adjust `mesh_modern.frag` to match (UBO if existing uses UBO, vec3 uniforms if existing uses uniforms).
- [ ] **Step 5.5: Build to verify shaders are copied to output**
Run: `dotnet build src/AcDream.App/AcDream.App.csproj`
Expected: PASS. After build, check `src/AcDream.App/bin/Debug/net10.0/Rendering/Shaders/` contains `mesh_modern.vert` + `mesh_modern.frag`.
- [ ] **Step 5.6: Commit**
```
phase(N.5) Task 5: mesh_modern.vert + .frag — bindless + SSBO + indirect
New entity shaders modeled on WB's StaticObjectModern.* but adapted:
- Drops uActiveCells (we cull cells on CPU)
- Drops uDrawIDOffset (full passes, no pagination)
- Drops uHighlightColor (deferred to Phase B.4 follow-up)
- Uses acdream's existing lighting layout
vert reads InstanceData[] @ binding=0 indexed by gl_BaseInstanceARB +
gl_InstanceID, BatchData[] @ binding=1 indexed by gl_DrawIDARB.
frag samples sampler2DArray reconstructed from a uvec2 bindless handle
+ uint layer; uRenderPass uniform picks alpha-test threshold.
Not yet wired to the dispatcher: Task 10 swaps the shader load and
Tasks 9-10 swap the draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 6: Wire mesh_modern shader load + capability check in GameWindow
**Files:**
- Modify: `src/AcDream.App/Rendering/GameWindow.cs`
- [ ] **Step 6.1: Read existing mesh_instanced load site**
Read `src/AcDream.App/Rendering/GameWindow.cs:960-980` (around the `_meshShader = new Shader(...)` line). Note the surrounding context — the WB foundation flag check, how the dispatcher is constructed.
- [ ] **Step 6.2: Add capability-gated mesh_modern load**
Find this block:
```csharp
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
```
Replace with:
```csharp
// N.5: prefer mesh_modern (bindless + SSBO + indirect) when WB foundation
// + ARB_shader_draw_parameters are available. Falls back to legacy
// mesh_instanced if any capability is missing — same code path as
// ACDREAM_USE_WB_FOUNDATION=0.
bool wbFoundationOn = WbFoundationFlag.IsEnabled;
bool useModernShader = false;
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
{
if (bindless.HasShaderDrawParameters(_gl))
{
try
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_modern.vert"),
Path.Combine(shadersDir, "mesh_modern.frag"));
_bindlessSupport = bindless;
useModernShader = true;
Console.WriteLine("[N.5] mesh_modern shader loaded (bindless + ARB_shader_draw_parameters)");
}
catch (Exception ex)
{
Console.WriteLine($"[N.5] mesh_modern compile failed, falling back: {ex.Message}");
}
}
else
{
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present, using legacy shader");
}
}
if (!useModernShader)
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
_bindlessSupport = null;
}
```
Add the `_bindlessSupport` field declaration alongside `_meshShader`:
```csharp
private BindlessSupport? _bindlessSupport;
```
Also add `using AcDream.App.Rendering.Wb;` at the top of the file if not already there.
- [ ] **Step 6.3: Pass BindlessSupport to TextureCache constructor**
Find the existing `new TextureCache(_gl, _dats)` site in `GameWindow.cs`. Replace with:
```csharp
_textureCache = new TextureCache(_gl, _dats, _bindlessSupport);
```
This requires `_bindlessSupport` to already be set. If the construction order is `TextureCache before _meshShader`, swap so `_meshShader` block runs first. Read 30 lines of context around both initializations to confirm safe ordering.
- [ ] **Step 6.4: Build + smoke test**
Run: `dotnet build`
Expected: PASS.
Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ tests PASS.
Smoke launch (manual, optional at this point):
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task6.log
```
Expected with the Step 6.2 block as written: launch logs show the `[N.5] mesh_modern shader loaded` line, but the visual is broken, because the modern shader is loaded while the dispatcher's per-group draw loop still hands it the legacy data layout. That mismatch is only resolved by Tasks 7-10, so the shader must not be swapped to `mesh_modern` until Task 10 lands.
**For Task 6, therefore, leave the `useModernShader = true` path out and only run the legacy load. With that, the visual stays identical to N.4 and the log shows `[N.5] modern path capabilities present` instead. Tasks 9-10 flip the modern shader on.** Update the block:
```csharp
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
{
if (bindless.HasShaderDrawParameters(_gl))
{
// Capability detected — store the support for later tasks.
// Shader swap happens in Task 10 once dispatcher is ready.
_bindlessSupport = bindless;
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
}
}
// Legacy shader load happens unconditionally for Task 6:
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
```
Task 10 will switch the shader load. Task 6 just plumbs `_bindlessSupport` so Task 7+ can use it.
- [ ] **Step 6.5: Commit**
```
phase(N.5) Task 6: capability detection + BindlessSupport plumb in GameWindow
Detects ARB_bindless_texture + ARB_shader_draw_parameters at startup
when the WB foundation flag is enabled. Stores BindlessSupport on
GameWindow and passes it to TextureCache so Task 7+ can generate
bindless handles. Mesh shader load remains mesh_instanced for now —
Task 10 swaps to mesh_modern after the dispatcher is rewired.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 7: Add SSBO + indirect buffer infrastructure to WbDrawDispatcher
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs`
- [ ] **Step 7.1: Create DrawElementsIndirectCommand struct**
Create `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs`:
```csharp
using System.Runtime.InteropServices;
namespace AcDream.App.Rendering.Wb;
/// <summary>
/// Layout matches what <c>glMultiDrawElementsIndirect</c> expects.
/// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this).
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
public uint Count; // index count for this draw
public uint InstanceCount; // number of instances
public uint FirstIndex; // offset into IBO, in indices
public int BaseVertex; // vertex offset into VBO
public uint BaseInstance; // first instance ID (offsets per-instance attribs / SSBO read)
}
```
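The 20-byte layout and the way `BaseInstance` feeds the shader's `gl_BaseInstanceARB + gl_InstanceID` lookup can be checked with a small sketch. Everything below is illustrative: `IndirectCommandSketch` copies the field layout of the struct above, while `IndirectLayoutSketch`, the prefix-sum packing of instances, and the placement of the transparent section directly after the opaque section are assumptions rather than the dispatcher's confirmed behavior.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative copy of the DrawElementsIndirectCommand field layout.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct IndirectCommandSketch
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

public static class IndirectLayoutSketch
{
    // Size in bytes of one marshalled command (five 4-byte fields).
    public static int CommandSize => Marshal.SizeOf<IndirectCommandSketch>();

    // ASSUMPTION: instances are packed back to back in the instance SSBO, so each
    // draw's BaseInstance is the running total of instances in earlier draws.
    public static IndirectCommandSketch[] Build(uint[] indexCounts, uint[] instanceCounts)
    {
        if (indexCounts.Length != instanceCounts.Length)
            throw new ArgumentException("One index count and one instance count per draw.");
        var commands = new IndirectCommandSketch[indexCounts.Length];
        uint baseInstance = 0;
        for (int i = 0; i < commands.Length; i++)
        {
            commands[i] = new IndirectCommandSketch
            {
                Count = indexCounts[i],
                InstanceCount = instanceCounts[i],
                FirstIndex = 0,  // placeholder; real values come from the mesh's index range
                BaseVertex = 0,  // placeholder
                BaseInstance = baseInstance,
            };
            baseInstance += instanceCounts[i];
        }
        return commands;
    }

    // ASSUMPTION: the transparent section starts right after the opaque section.
    public static int TransparentByteOffset(int opaqueDrawCount) =>
        opaqueDrawCount * CommandSize;
}
```

A size assertion of this kind is cheap to keep in a unit test, since a mismatched stride would silently corrupt every command after the first.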
- [ ] **Step 7.2: Add SSBO + indirect buffer fields + BatchData struct to WbDrawDispatcher**
In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add at the top of the class (replacing the existing `_instanceVbo` field):
```csharp
private readonly BindlessSupport _bindless;
// SSBO buffer ids
private uint _instanceSsbo;
private uint _batchSsbo;
private uint _indirectBuffer;
// Per-frame scratch arrays
private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance
private BatchData[] _batchData = new BatchData[256];
private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256];
private int _opaqueDrawCount;
private int _transparentDrawCount;
private int _transparentByteOffset;
[StructLayout(LayoutKind.Sequential, Pack = 4)]
private struct BatchData
{
public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
public uint TextureLayer;
public uint Flags;
}
```
Remove the existing `private readonly uint _instanceVbo;` field.
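Because `BatchData` is uploaded raw into the `std430` `BatchBuffer`, its C# layout has to line up with the GLSL struct (`uvec2` at offset 0, `uint` at 8, `uint` at 12, 16 bytes total). A minimal sketch of that check follows; `BatchDataSketch` copies the field layout above, and `BatchLayoutSketch` is a hypothetical helper, not part of the dispatcher.

```csharp
using System.Runtime.InteropServices;

// Illustrative copy of the BatchData field layout.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct BatchDataSketch
{
    public ulong TextureHandle; // read as uvec2 in GLSL
    public uint TextureLayer;
    public uint Flags;
}

public static class BatchLayoutSketch
{
    public static int Size => Marshal.SizeOf<BatchDataSketch>();

    public static int OffsetOf(string field) =>
        (int)Marshal.OffsetOf<BatchDataSketch>(field);

    // True when the marshalled layout matches std430 for { uvec2; uint; uint; }.
    public static bool MatchesStd430() =>
        Size == 16
        && OffsetOf(nameof(BatchDataSketch.TextureHandle)) == 0
        && OffsetOf(nameof(BatchDataSketch.TextureLayer)) == 8
        && OffsetOf(nameof(BatchDataSketch.Flags)) == 12;
}
```

If a field such as the reserved highlight color is added later, a check like this flags a stride mismatch between the CPU and shader sides at test time.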
- [ ] **Step 7.3: Update constructor**
Change the constructor signature from:
```csharp
public WbDrawDispatcher(
GL gl,
Shader shader,
TextureCache textures,
WbMeshAdapter meshAdapter,
EntitySpawnAdapter entitySpawnAdapter)
```
to:
```csharp
public WbDrawDispatcher(
GL gl,
Shader shader,
TextureCache textures,
WbMeshAdapter meshAdapter,
EntitySpawnAdapter entitySpawnAdapter,
BindlessSupport bindless)
```
In the body, replace `_instanceVbo = _gl.GenBuffer();` with:
```csharp
_bindless = bindless ?? throw new ArgumentNullException(nameof(bindless));
_instanceSsbo = _gl.GenBuffer();
_batchSsbo = _gl.GenBuffer();
_indirectBuffer = _gl.GenBuffer();
```
- [ ] **Step 7.4: Update Dispose**
Replace the existing `Dispose()` body:
```csharp
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_gl.DeleteBuffer(_instanceSsbo);
_gl.DeleteBuffer(_batchSsbo);
_gl.DeleteBuffer(_indirectBuffer);
}
```
- [ ] **Step 7.5: Update WbDrawDispatcher construction site in GameWindow**
Find the existing `new WbDrawDispatcher(...)` call in `GameWindow.cs` and add `_bindlessSupport!` as the final argument. The `!` (null-forgiving operator) is safe here: the dispatcher is only constructed when WB foundation is on, which already implies bindless support is present.
- [ ] **Step 7.6: Build + tests**
Run: `dotnet build`
Expected: PASS.
Run: `dotnet test --filter "FullyQualifiedName~Wb"`
Expected: PASS (existing tests don't exercise the changed buffer plumbing; the draw path itself is rewritten in Tasks 9-10).
Removing `_instanceVbo` breaks every reference to it in the existing `Draw()` path. To keep the build compiling through Tasks 7-9, leave `Draw()` untouched except for substituting `_instanceVbo` → `_instanceSsbo`, so the legacy path uses the SSBO as if it were a vertex buffer. The behavior is not what N.5 wants, but it compiles; visuals only change once the shader flips in Task 10, and Tasks 9-10 fully rewrite the method. Prefer this over commenting out the `Draw()` body.
- [ ] **Step 7.7: Commit**
```
phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure
Adds DrawElementsIndirectCommand struct (20-byte layout for
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
constructor parameter — non-null required since the dispatcher is only
constructed when WB foundation is on.
Existing Draw() method substitutes _instanceVbo → _instanceSsbo for
compile. Behavior temporarily wrong; Tasks 9-10 fully rewrite the
draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 8: Update InstanceGroup + GroupKey for bindless handles
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- [ ] **Step 8.1: Update InstanceGroup**
In `WbDrawDispatcher.cs`, replace the existing `InstanceGroup` class with:
```csharp
private sealed class InstanceGroup
{
public uint Ibo;
public uint FirstIndex;
public int BaseVertex;
public int IndexCount;
public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4)
public uint TextureLayer; // 0 for per-instance composites
public TranslucencyKind Translucency;
public int FirstInstance;
public int InstanceCount;
public float SortDistance;
public readonly List<Matrix4x4> Matrices = new();
}
```
- [ ] **Step 8.2: Update GroupKey**
Replace the `GroupKey` record:
```csharp
private readonly record struct GroupKey(
uint Ibo,
uint FirstIndex,
int BaseVertex,
int IndexCount,
ulong BindlessTextureHandle,
uint TextureLayer,
TranslucencyKind Translucency);
```
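`GroupKey` works as a dictionary key because a `readonly record struct` gets value-based `Equals` and `GetHashCode` generated by the compiler: two batches with identical field values land in the same group. A trimmed-down sketch of that bucketing, assuming a hypothetical `KeySketch` with fewer fields than the real key:

```csharp
using System;
using System.Collections.Generic;

public enum KindSketch { Opaque, AlphaBlend }

// Hypothetical, reduced key: same idea as GroupKey, fewer fields.
public readonly record struct KeySketch(
    uint Ibo, uint FirstIndex, ulong TextureHandle, KindSketch Kind);

public static class BucketingSketch
{
    // Counts how many instances share each key. Keys compare by value,
    // so equal field values always map to the same bucket.
    public static Dictionary<KeySketch, int> Bucket(IEnumerable<KeySketch> draws)
    {
        var counts = new Dictionary<KeySketch, int>();
        foreach (var key in draws)
        {
            counts.TryGetValue(key, out int n); // n stays 0 when the key is new
            counts[key] = n + 1;
        }
        return counts;
    }
}
```

Widening the texture handle to `ulong` changes nothing about this mechanism; a different 64-bit handle simply yields a different key and therefore a separate draw.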
- [ ] **Step 8.3: Update ResolveTexture method**
Replace the existing `ResolveTexture` method (returns `uint`) with:
```csharp
private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
{
uint surfaceId = batch.Key.SurfaceId;
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
uint overrideOrigTex = 0;
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
&& meshRef.SurfaceOverrides.TryGetValue(surfaceId, out overrideOrigTex);
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
if (entity.PaletteOverride is not null)
{
return _textures.GetOrUploadWithPaletteOverrideBindless(
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
}
else if (hasOrigTexOverride)
{
return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex);
}
else
{
return _textures.GetOrUploadBindless(surfaceId);
}
}
```
- [ ] **Step 8.4: Update ClassifyBatches to use the new return type**
Replace the existing `ClassifyBatches` to use `ulong texHandle` and pass the layer:
```csharp
private void ClassifyBatches(
ObjectRenderData renderData,
ulong gfxObjId,
Matrix4x4 model,
WorldEntity entity,
MeshRef meshRef,
ulong palHash,
AcSurfaceMetadataTable metaTable)
{
for (int batchIdx = 0; batchIdx < renderData.Batches.Count; batchIdx++)
{
var batch = renderData.Batches[batchIdx];
TranslucencyKind translucency;
if (metaTable.TryLookup(gfxObjId, batchIdx, out var meta))
{
translucency = meta.Translucency;
}
else
{
translucency = batch.IsAdditive ? TranslucencyKind.Additive
: batch.IsTransparent ? TranslucencyKind.AlphaBlend
: TranslucencyKind.Opaque;
}
ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash);
if (texHandle == 0) continue;
// For per-instance composites we use 1-layer Texture2DArray, layer always 0.
// When N.6 adopts WB's atlas, this becomes batch's layer index.
uint texLayer = 0;
var key = new GroupKey(
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
batch.IndexCount, texHandle, texLayer, translucency);
if (!_groups.TryGetValue(key, out var grp))
{
grp = new InstanceGroup
{
Ibo = batch.IBO,
FirstIndex = batch.FirstIndex,
BaseVertex = (int)batch.BaseVertex,
IndexCount = batch.IndexCount,
BindlessTextureHandle = texHandle,
TextureLayer = texLayer,
Translucency = translucency,
};
_groups[key] = grp;
}
grp.Matrices.Add(model);
}
}
```
- [ ] **Step 8.5: Update remaining DrawGroup/EnsureInstanceAttribs references**
Do not comment out `DrawGroup` or `EnsureInstanceAttribs`: the build has to stay green for Step 8.6. Instead, replace the `DrawGroup` body with `throw new NotImplementedException("Task 9-10 rewrites this");` so its call sites compile but throw at runtime. `EnsureInstanceAttribs` can stay as dead code as long as it still compiles; Task 10 deletes both methods. Rendering is broken until Task 10. That's expected.
Update the `Draw()` method's per-group loop to compile:
```csharp
foreach (var grp in _opaqueDraws)
{
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
DrawGroup(grp); // throws — Task 10 fixes
}
```
(The user does NOT visually verify at this task. Build green only.)
- [ ] **Step 8.6: Build**
Run: `dotnet build`
Expected: PASS.
Run: `dotnet test --filter "FullyQualifiedName~Wb"`
Expected: existing tests PASS (they're CPU-only — they don't actually invoke `DrawGroup`).
- [ ] **Step 8.7: Commit**
```
phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer
Replaces uint TextureHandle (32-bit GL name) with ulong
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
return type. Adds TextureLayer (always 0 for per-instance composites,
becomes meaningful when WB atlas is adopted in N.6).
ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants.
DrawGroup body throws NotImplementedException — Task 9-10 rewrites
the draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 9: Build BatchData + DEIC arrays per frame (TDD)
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`
This task adds a pure CPU method `BuildIndirectArrays()` that the dispatcher will call before issuing draws. Unit-testable without GL context.
- [ ] **Step 9.1: Write the failing test**
Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`:
```csharp
using System.Collections.Generic;
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;
namespace AcDream.Core.Tests.Rendering.Wb;
/// <summary>
/// Pure CPU test of <see cref="WbDrawDispatcher.BuildIndirectArrays"/>.
/// Builds a synthetic group set and verifies the laid-out indirect commands
/// match the spec §5 walk-through.
/// </summary>
public sealed class WbDrawDispatcherIndirectBuilderTests
{
[Fact]
public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst()
{
// Arrange — synthetic groups laid out as in spec §5
var groups = new List<WbDrawDispatcher.IndirectGroupInput>
{
new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend),
new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
};
var indirect = new DrawElementsIndirectCommand[16];
var batch = new WbDrawDispatcher.BatchDataPublic[16];
// Act
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
// Assert layout
Assert.Equal(2, result.OpaqueCount);
Assert.Equal(1, result.TransparentCount);
Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20
// Opaque section, in input order (BuildIndirectArrays does not sort; Draw() sorts opaque groups before calling it)
Assert.Equal(100u, indirect[0].Count);
Assert.Equal(0u, indirect[0].FirstIndex);
Assert.Equal(0, indirect[0].BaseVertex);
Assert.Equal(12u, indirect[0].InstanceCount);
Assert.Equal(0u, indirect[0].BaseInstance);
Assert.Equal(50u, indirect[1].Count);
Assert.Equal(300u, indirect[1].FirstIndex);
Assert.Equal(100, indirect[1].BaseVertex);
Assert.Equal(1u, indirect[1].InstanceCount);
Assert.Equal(24u, indirect[1].BaseInstance);
// Transparent section
Assert.Equal(200u, indirect[2].Count);
Assert.Equal(100u, indirect[2].FirstIndex);
Assert.Equal(12u, indirect[2].InstanceCount);
Assert.Equal(12u, indirect[2].BaseInstance);
// BatchData parallel
Assert.Equal(0xAAul, batch[0].TextureHandle);
Assert.Equal(0xCCul, batch[1].TextureHandle);
Assert.Equal(0xBBul, batch[2].TextureHandle);
}
[Fact]
public void EmptyGroupList_ProducesZeroCounts()
{
var groups = new List<WbDrawDispatcher.IndirectGroupInput>();
var indirect = new DrawElementsIndirectCommand[0];
var batch = new WbDrawDispatcher.BatchDataPublic[0];
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
Assert.Equal(0, result.OpaqueCount);
Assert.Equal(0, result.TransparentCount);
Assert.Equal(0, result.TransparentByteOffset);
}
}
```
- [ ] **Step 9.2: Run, verify it fails**
Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"`
Expected: COMPILE FAIL — `BuildIndirectArrays` and supporting public types don't exist.
- [ ] **Step 9.3: Implement BuildIndirectArrays + supporting types**
In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add public helper types + static method (above the private `InstanceGroup` class):
```csharp
/// <summary>Public view of the per-group inputs to <see cref="BuildIndirectArrays"/> — used in tests.</summary>
public readonly record struct IndirectGroupInput(
int IndexCount,
uint FirstIndex,
int BaseVertex,
int InstanceCount,
int FirstInstance,
ulong TextureHandle,
uint TextureLayer,
TranslucencyKind Translucency);
/// <summary>Public mirror of the per-group BatchData laid into the SSBO. Tests verify alignment.</summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct BatchDataPublic
{
public ulong TextureHandle;
public uint TextureLayer;
public uint Flags;
}
public readonly record struct IndirectLayoutResult(
int OpaqueCount,
int TransparentCount,
int TransparentByteOffset);
/// <summary>
/// Lays out the indirect commands + parallel BatchData array contiguously:
/// opaque section first, transparent section second. Pure CPU, no GL state.
/// Caller passes scratch arrays (pre-sized).
/// </summary>
public static IndirectLayoutResult BuildIndirectArrays(
IReadOnlyList<IndirectGroupInput> groups,
DrawElementsIndirectCommand[] indirectScratch,
BatchDataPublic[] batchScratch)
{
int opaqueCount = 0;
int transparentCount = 0;
// First pass: count
foreach (var g in groups)
{
if (IsOpaque(g.Translucency)) opaqueCount++;
else transparentCount++;
}
// Second pass: lay out — opaque [0..opaqueCount), transparent [opaqueCount..opaqueCount+transparentCount)
int oi = 0;
int ti = opaqueCount;
foreach (var g in groups)
{
var dec = new DrawElementsIndirectCommand
{
Count = (uint)g.IndexCount,
InstanceCount = (uint)g.InstanceCount,
FirstIndex = g.FirstIndex,
BaseVertex = g.BaseVertex,
BaseInstance = (uint)g.FirstInstance,
};
var bd = new BatchDataPublic
{
TextureHandle = g.TextureHandle,
TextureLayer = g.TextureLayer,
Flags = 0,
};
if (IsOpaque(g.Translucency))
{
indirectScratch[oi] = dec;
batchScratch[oi] = bd;
oi++;
}
else
{
indirectScratch[ti] = dec;
batchScratch[ti] = bd;
ti++;
}
}
int sizeofDEIC = 20; // matches struct layout
return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * sizeofDEIC);
}
private static bool IsOpaque(TranslucencyKind t)
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
```
- [ ] **Step 9.4: Run test, verify pass**
Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"`
Expected: PASS (2 tests).
Run full filter: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ existing tests + 2 new = PASS.
- [ ] **Step 9.5: Commit**
```
phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.
Tests cover the spec §5 walk-through layout: per-group fields propagate
correctly, opaque/transparent partition lands at the expected indices.
Static + public so tests can exercise without a GL context. Task 10
wires it into Draw().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 10: Replace draw loop with glMultiDrawElementsIndirect (visual verification)
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Modify: `src/AcDream.App/Rendering/GameWindow.cs`
This is the load-bearing task. After this lands, visual verification is required.
- [ ] **Step 10.1: Rewrite WbDrawDispatcher.Draw**
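Replace the entire `Draw()` method body in `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`. Phases 1-3 (entity walk, group bucketing, matrix layout) stay as they are; phases 4-8 (indirect array build, buffer uploads, VAO bind, opaque and transparent passes) are new. `MemoryMarshal` requires `using System.Runtime.InteropServices;`: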
```csharp
public unsafe void Draw(
ICamera camera,
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
FrustumPlanes? frustum = null,
uint? neverCullLandblockId = null,
HashSet<uint>? visibleCellIds = null,
HashSet<uint>? animatedEntityIds = null)
{
_shader.Use();
var vp = camera.View * camera.Projection;
_shader.SetMatrix4("uViewProjection", vp);
// Lighting uniforms — match what mesh_modern.frag declares (Task 5.3).
// Read the existing N.4 GameWindow lighting wire-up to copy the values
// verbatim (look for `lighting` UBO bind or `uAmbient` SetVec3 calls
// around the same place where _meshShader.Use() / SetMatrix4 happens).
// If N.4 used a UBO: change mesh_modern.frag in Task 5.3 to match the UBO,
// then bind the UBO here via `_gl.BindBufferBase(UniformBuffer, 1, lightingUbo)`.
// If N.4 used uniforms: replicate the same SetVec3 calls here.
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
Vector3 camPos = Vector3.Zero;
if (Matrix4x4.Invert(camera.View, out var invView))
camPos = invView.Translation;
// ── Phases 1-2: walk entities, build groups, lay matrices ───────────
foreach (var grp in _groups.Values) grp.Matrices.Clear();
var metaTable = _meshAdapter.MetadataTable;
uint anyVao = 0;
foreach (var entry in landblockEntries)
{
bool landblockVisible = frustum is null
|| entry.LandblockId == neverCullLandblockId
|| FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax);
if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
continue;
foreach (var entity in entry.Entities)
{
if (entity.MeshRefs.Count == 0) continue;
bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true;
if (!landblockVisible && !isAnimated) continue;
if (entity.ParentCellId.HasValue && visibleCellIds is not null
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
continue;
if (frustum is not null && !isAnimated && entry.LandblockId != neverCullLandblockId)
{
var p = entity.Position;
var aMin = new Vector3(p.X - PerEntityCullRadius, p.Y - PerEntityCullRadius, p.Z - PerEntityCullRadius);
var aMax = new Vector3(p.X + PerEntityCullRadius, p.Y + PerEntityCullRadius, p.Z + PerEntityCullRadius);
if (!FrustumCuller.IsAabbVisible(frustum.Value, aMin, aMax))
continue;
}
if (diag) _entitiesSeen++;
var entityWorld =
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
Matrix4x4.CreateTranslation(entity.Position);
ulong palHash = 0;
if (entity.PaletteOverride is not null)
palHash = TextureCache.HashPaletteOverride(entity.PaletteOverride);
bool drewAny = false;
for (int partIdx = 0; partIdx < entity.MeshRefs.Count; partIdx++)
{
var meshRef = entity.MeshRefs[partIdx];
ulong gfxObjId = meshRef.GfxObjId;
var renderData = _meshAdapter.TryGetRenderData(gfxObjId);
if (renderData is null) { if (diag) _meshesMissing++; continue; }
drewAny = true;
if (anyVao == 0) anyVao = renderData.VAO;
if (renderData.IsSetup && renderData.SetupParts.Count > 0)
{
foreach (var (partGfxObjId, partTransform) in renderData.SetupParts)
{
var partData = _meshAdapter.TryGetRenderData(partGfxObjId);
if (partData is null) continue;
var model = ComposePartWorldMatrix(entityWorld, meshRef.PartTransform, partTransform);
ClassifyBatches(partData, partGfxObjId, model, entity, meshRef, palHash, metaTable);
}
}
else
{
var model = meshRef.PartTransform * entityWorld;
ClassifyBatches(renderData, gfxObjId, model, entity, meshRef, palHash, metaTable);
}
}
if (diag && drewAny) _entitiesDrawn++;
}
}
if (anyVao == 0) { if (diag) MaybeFlushDiag(); return; }
int totalInstances = 0;
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
if (totalInstances == 0) { if (diag) MaybeFlushDiag(); return; }
// ── Phase 3: assign FirstInstance per group, lay matrices contiguous ─
int needed = totalInstances * 16;
if (_instanceData.Length < needed)
_instanceData = new float[needed + 256 * 16];
_opaqueDraws.Clear();
_translucentDraws.Clear();
int cursor = 0;
foreach (var grp in _groups.Values)
{
if (grp.Matrices.Count == 0) continue;
grp.FirstInstance = cursor;
grp.InstanceCount = grp.Matrices.Count;
var first = grp.Matrices[0];
var grpPos = new Vector3(first.M41, first.M42, first.M43);
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
for (int i = 0; i < grp.Matrices.Count; i++)
{
WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]);
cursor++;
}
if (IsOpaqueGroup(grp.Translucency))
_opaqueDraws.Add(grp);
else
_translucentDraws.Add(grp);
}
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
// ── Phase 4: build BatchData + DEIC arrays ──────────────────────────
int totalDraws = _opaqueDraws.Count + _translucentDraws.Count;
if (_batchData.Length < totalDraws)
_batchData = new BatchData[totalDraws + 64];
if (_indirectCommands.Length < totalDraws)
_indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64];
var groupInputs = new List<IndirectGroupInput>(totalDraws);
foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g));
foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g));
// BuildIndirectArrays writes BatchDataPublic, so fill a scratch array and
// reinterpret it back into _batchData. This relies on layout equivalence
// (BatchData and BatchDataPublic are both [StructLayout(Sequential, Pack = 4)]
// with the same fields in the same order). The scratch array must be kept in
// a local: passing a temporary copy would discard everything written to it.
var batchScratch = new BatchDataPublic[totalDraws];
var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchScratch);
MemoryMarshal.Cast<BatchDataPublic, BatchData>(batchScratch.AsSpan()).CopyTo(_batchData);
_opaqueDrawCount = layout.OpaqueCount;
_transparentDrawCount = layout.TransparentCount;
_transparentByteOffset = layout.TransparentByteOffset;
// ── Phase 5: upload three buffers ───────────────────────────────────
fixed (float* ip = _instanceData)
UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float));
fixed (BatchData* bp = _batchData)
UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData));
fixed (DrawElementsIndirectCommand* cp = _indirectCommands)
{
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.BufferData(BufferTargetARB.DrawIndirectBuffer,
(nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw);
}
// ── Phase 6: bind global VAO once ───────────────────────────────────
_gl.BindVertexArray(anyVao);
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
_gl.Disable(EnableCap.CullFace);
// ── Phase 7: opaque pass ───────────────────────────────────────────
if (_opaqueDrawCount > 0)
{
_gl.Disable(EnableCap.Blend);
_gl.DepthMask(true);
_shader.SetInt("uRenderPass", 0);
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.MultiDrawElementsIndirect(
PrimitiveType.Triangles,
DrawElementsType.UnsignedShort,
indirect: (void*)0,
drawcount: (uint)_opaqueDrawCount,
stride: (uint)sizeof(DrawElementsIndirectCommand));
}
// ── Phase 8: transparent pass ──────────────────────────────────────
if (_transparentDrawCount > 0)
{
_gl.Enable(EnableCap.Blend);
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
_gl.DepthMask(false);
_shader.SetInt("uRenderPass", 1);
_gl.MultiDrawElementsIndirect(
PrimitiveType.Triangles,
DrawElementsType.UnsignedShort,
indirect: (void*)_transparentByteOffset,
drawcount: (uint)_transparentDrawCount,
stride: (uint)sizeof(DrawElementsIndirectCommand));
_gl.DepthMask(true);
_gl.Disable(EnableCap.Blend);
}
_gl.Disable(EnableCap.CullFace);
_gl.BindVertexArray(0);
if (diag)
{
_drawsIssued += _opaqueDrawCount + _transparentDrawCount;
_instancesIssued += totalInstances;
MaybeFlushDiag();
}
}
private static bool IsOpaqueGroup(TranslucencyKind t)
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
private static IndirectGroupInput ToInput(InstanceGroup g) => new(
IndexCount: g.IndexCount,
FirstIndex: g.FirstIndex,
BaseVertex: g.BaseVertex,
InstanceCount: g.InstanceCount,
FirstInstance: g.FirstInstance,
TextureHandle: g.BindlessTextureHandle,
TextureLayer: g.TextureLayer,
Translucency: g.Translucency);
private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount)
{
_gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo);
_gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw);
_gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo);
}
```
Delete the old `DrawGroup`, `EnsureInstanceAttribs`, and `ResolveTexture` (the old uint-returning version) methods — they're no longer called.
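Phase 4 depends on `BatchData` and `BatchDataPublic` having identical memory layout. A minimal sketch of moving data between two such structs with `MemoryMarshal.Cast`, assuming hypothetical `PublicBatchSketch` / `PrivateBatchSketch` types that mirror the real pair:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct PublicBatchSketch
{
    public ulong TextureHandle;
    public uint TextureLayer;
    public uint Flags;
}

[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct PrivateBatchSketch
{
    public ulong TextureHandle;
    public uint TextureLayer;
    public uint Flags;
}

public static class ReinterpretSketch
{
    // Copies the first `count` entries of src into dst by reinterpreting
    // the bytes; there is no per-element conversion.
    public static void CopyInto(PublicBatchSketch[] src, PrivateBatchSketch[] dst, int count)
    {
        Span<PrivateBatchSketch> view =
            MemoryMarshal.Cast<PublicBatchSketch, PrivateBatchSketch>(src.AsSpan(0, count));
        view.CopyTo(dst); // throws ArgumentException if dst is shorter than count
    }
}
```

The cast is only a view over the source array's memory. Anything written to a separate copy of that view (for example the result of `ToArray()`) never reaches the original array, which is why the destination has to be filled explicitly.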
- [ ] **Step 10.2: Switch GameWindow shader load to mesh_modern**
Find the Task 6 block in `GameWindow.cs` and change the shader load from `mesh_instanced` to `mesh_modern` when `_bindlessSupport != null`:
```csharp
if (_bindlessSupport is not null)
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_modern.vert"),
Path.Combine(shadersDir, "mesh_modern.frag"));
Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
}
```
- [ ] **Step 10.3: Build + run all tests**
Run: `dotnet build`
Expected: PASS.
Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ tests + 2 new BuildIndirectArrays tests PASS.
- [ ] **Step 10.4: Visual smoke test (USER GATE)**
Launch:
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_WB_DIAG = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task10.log
```
Expected:
- Console shows `[N.5] mesh_modern shader loaded`.
- Holtburg renders with characters + scenery + buildings visible.
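- `[WB-DIAG]` shows draws dropping from N.4's hundreds to ~3-5 per frame for entity rendering. (Caveat: as written in Step 10.1, `drawsIssued` adds `_opaqueDrawCount + _transparentDrawCount`, i.e. one per indirect command accumulated over the log window, so it tracks `groups` rather than GL draw calls.)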
User confirms visual identity. If broken, debug — most likely failure modes:
1. Shader compile failure → console log will show GLSL info log; fix vert/frag.
2. Black textures everywhere → bindless handle generation broken; check `_bindless` is non-null in TextureCache.
3. Wrong geometry → BaseVertex / FirstIndex misaligned; verify against N.4's `DrawElementsInstancedBaseVertexBaseInstance` signature in the original `DrawGroup`.
4. Wrong matrices on entities → InstanceSsbo upload size wrong; verify `totalInstances * 16 * sizeof(float)`.
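For failure modes 3 and 4, a CPU-side sanity pass over the command array can narrow things down before reaching for a GPU debugger. A sketch under stated assumptions: `CommandSketch` is a hypothetical record with the same fields as `DrawElementsIndirectCommand`, and the caller knows the index-buffer length and the number of uploaded instance matrices. None of these names exist in the codebase:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in with the same fields as DrawElementsIndirectCommand.
public readonly record struct CommandSketch(
    uint Count, uint InstanceCount, uint FirstIndex, int BaseVertex, uint BaseInstance);

public static class CommandValidatorSketch
{
    // Returns the index of the first command that reads outside the index
    // buffer or the instance array, or -1 if every command is in range.
    public static int FirstInvalid(
        IReadOnlyList<CommandSketch> commands,
        int drawCount,
        uint indexBufferLength,
        uint totalInstances)
    {
        for (int i = 0; i < drawCount; i++)
        {
            var c = commands[i];
            // Widen to ulong so the sums cannot wrap around.
            bool indicesOk = (ulong)c.FirstIndex + c.Count <= indexBufferLength;
            bool instancesOk = (ulong)c.BaseInstance + c.InstanceCount <= totalInstances;
            if (!indicesOk || !instancesOk) return i;
        }
        return -1;
    }
}
```

Running a check like this only under `ACDREAM_WB_DIAG=1` would keep it out of the normal frame cost.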
- [ ] **Step 10.5: Commit only after visual verification passes**
```
phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three buffers: two SSBOs (instance matrices @ binding=0,
batch data @ binding=1) plus the indirect command buffer.
Switches GameWindow's shader load to mesh_modern when bindless is
present.
Visual verification: Holtburg courtyard renders identical to N.4.
Entity draw calls drop from "few hundred per pass" to 1 per pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 11: Update ClassifyBatches for translucency restructure (TDD)
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`
Per Decision 2: `Additive` and `InvAlpha` merge into the transparent (alpha-blend) pass. The dispatcher already does this via Task 10's `IsOpaqueGroup`, which returns true only for `Opaque` and `ClipMap`, so `ClassifyBatches` itself needs no change. This task adds a unit test that locks the contract in.
- [ ] **Step 11.1: Write the failing test**
Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`:
```csharp
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;
namespace AcDream.Core.Tests.Rendering.Wb;
/// <summary>
/// Locks in the N.5 translucency partition contract (Decision 2):
/// Opaque + ClipMap → opaque indirect; AlphaBlend + Additive + InvAlpha → transparent.
/// </summary>
public sealed class WbDrawDispatcherTranslucencyTests
{
[Theory]
[InlineData(TranslucencyKind.Opaque, true)]
[InlineData(TranslucencyKind.ClipMap, true)]
[InlineData(TranslucencyKind.AlphaBlend, false)]
[InlineData(TranslucencyKind.Additive, false)]
[InlineData(TranslucencyKind.InvAlpha, false)]
public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected)
{
Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind));
}
}
```
- [ ] **Step 11.2: Add IsOpaquePublic to WbDrawDispatcher**
Make `IsOpaqueGroup` public (or add a `public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);` shim):
```csharp
public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);
```
- [ ] **Step 11.3: Run test, verify PASS**
Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherTranslucency"`
Expected: 5 tests PASS.
Run all: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ + 2 + 5 = 67+ PASS.
- [ ] **Step 11.4: Commit**
```
phase(N.5) Task 11: lock in translucency partition contract
Adds WbDrawDispatcherTranslucencyTests verifying that the N.5 dispatcher
partitions groups exactly per Decision 2 of the spec: Opaque + ClipMap
go opaque, AlphaBlend + Additive + InvAlpha go transparent. Catches
future refactors that drift the partition.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 12: Add CPU stopwatch + GL timer query timing in [WB-DIAG]
**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- [ ] **Step 12.1: Add timing fields**
In `WbDrawDispatcher.cs`, add to the diagnostic-counter block:
```csharp
// CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1
private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new();
private readonly long[] _cpuSamples = new long[256]; // microseconds
private int _cpuSampleCursor;
private uint _gpuQueryOpaque;
private uint _gpuQueryTransparent;
private readonly long[] _gpuSamples = new long[256]; // microseconds
private int _gpuSampleCursor;
private bool _gpuQueriesInitialized;
```
- [ ] **Step 12.2: Initialize GPU queries lazily in Draw()**
In `Draw()`, immediately after the `bool diag = ...` line (the block reads `diag`, so it cannot go before it), add:
```csharp
if (diag && !_gpuQueriesInitialized)
{
_gpuQueryOpaque = _gl.GenQuery();
_gpuQueryTransparent = _gl.GenQuery();
_gpuQueriesInitialized = true;
}
```
- [ ] **Step 12.3: Wrap the draw passes with timing**
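Restart the stopwatch unconditionally (it is cheap) rather than under `if (diag)`; only sampling and logging are gated on `diag`.
At the very top of `Draw()` (just inside the method):
```csharp
_cpuStopwatch.Restart();
```
Wrap the opaque pass in a `GL_TIME_ELAPSED` query. Put the `BeginQuery`/`EndQuery` pair around the whole `if (_opaqueDrawCount > 0) { ... }` block so the query is issued even when the pass is empty, and read back the previous frame's results just before the first `BeginQuery`. Polling right after `EndQuery` in the same frame would rarely find a result available, and the next `BeginQuery` on the same query object resets it:
```csharp
// IsQuery is false until a query name has been used by BeginQuery at least once.
if (diag && _gl.IsQuery(_gpuQueryOpaque) && _gl.IsQuery(_gpuQueryTransparent))
{
    // Previous frame's results: non-blocking, skipped if either is not ready yet.
    _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResultAvailable, out int opaqueReady);
    _gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResultAvailable, out int transReady);
    if (opaqueReady != 0 && transReady != 0)
    {
        _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResult, out long opaqueNs);
        _gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResult, out long transNs);
        long gpuUs = (opaqueNs + transNs) / 1000; // nanoseconds to microseconds
        _gpuSamples[_gpuSampleCursor] = gpuUs;
        _gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length;
    }
}
if (diag) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque);
if (_opaqueDrawCount > 0) { /* existing opaque pass, unchanged */ }
if (diag) _gl.EndQuery(QueryTarget.TimeElapsed);
```
Same for the transparent pass with `_gpuQueryTransparent` (without a second readback).
At the bottom of `Draw()` (after `_gl.BindVertexArray(0)`):
```csharp
_cpuStopwatch.Stop();
if (diag)
{
    long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency;
    _cpuSamples[_cpuSampleCursor] = cpuUs;
    _cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length;
}
```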
- [ ] **Step 12.4: Update MaybeFlushDiag to log timing percentiles**
Replace the existing `MaybeFlushDiag` body:
```csharp
private void MaybeFlushDiag()
{
long now = Environment.TickCount64;
if (now - _lastLogTick > 5000)
{
long cpuMed = MedianMicros(_cpuSamples);
long cpuP95 = Percentile95Micros(_cpuSamples);
long gpuMed = MedianMicros(_gpuSamples);
long gpuP95 = Percentile95Micros(_gpuSamples);
Console.WriteLine(
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " +
$"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95");
_entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
_lastLogTick = now;
}
}
private static long MedianMicros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
int nz = 0;
foreach (var v in copy) if (v > 0) { nz++; }
if (nz == 0) return 0;
// Non-zero samples occupy the last nz slots after the ascending sort.
return copy[copy.Length - nz + nz / 2];
}
private static long Percentile95Micros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
int nz = 0;
foreach (var v in copy) if (v > 0) { nz++; }
if (nz == 0) return 0;
int idx = copy.Length - 1 - (int)(nz * 0.05);
return copy[idx];
}
```
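The percentile helpers depend on one indexing detail: the sample buffers are zero-padded until 256 frames have been recorded, so after an ascending sort the real samples sit in the last `nz` slots. A standalone sketch of that indexing, with the hypothetical names `PercentileSketch` and `Nth` (not part of the dispatcher):

```csharp
using System;

public static class PercentileSketch
{
    // fraction must be in [0, 1): 0.5 for the median, 0.95 for p95.
    // Zero entries are treated as "no sample yet" and ignored.
    public static long Nth(long[] samples, double fraction)
    {
        var copy = (long[])samples.Clone();
        Array.Sort(copy);

        int nz = 0;
        foreach (var v in copy) if (v > 0) nz++;
        if (nz == 0) return 0;

        int first = copy.Length - nz;           // index of the smallest non-zero sample
        int idx = first + (int)(nz * fraction); // in range because fraction < 1
        return copy[idx];
    }
}
```

With a single recorded sample, `first` is the last index and the offset is 0, so the lookup stays inside the array instead of reading one past the end.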
- [ ] **Step 12.5: Update Dispose**
Add to `Dispose()`:
```csharp
if (_gpuQueriesInitialized)
{
_gl.DeleteQuery(_gpuQueryOpaque);
_gl.DeleteQuery(_gpuQueryTransparent);
}
```
- [ ] **Step 12.6: Build + smoke test**
Run: `dotnet build`
Expected: PASS.
Smoke launch with `ACDREAM_WB_DIAG=1`. Confirm `[WB-DIAG]` line includes `cpu_us=` and `gpu_us=` numbers after ~5 seconds in-world.
- [ ] **Step 12.7: Commit**
```
phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG]
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent), polled non-blocking on next frame.
Numbers populate the SHIP commit message (Task 19).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 13: Capture before/after perf numbers (USER GATE)
**Files:**
- (none — measurement task)
- [ ] **Step 13.1: Capture N.5 numbers in Holtburg courtyard**
Launch acdream with `ACDREAM_WB_DIAG=1`. Position character at Holtburg courtyard, 30m elevated, looking SW. Stand still for ~30 seconds. Read the `[WB-DIAG]` line. Record:
```
N.5 Holtburg courtyard:
cpu_us=Xmedian/Yp95
gpu_us=Zmedian/Wp95
drawsIssued=K
groups=G
```
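Copying four numbers by hand from a scrolling console is error-prone, so it may help to extract them mechanically. A minimal sketch, assuming the `cpu_us=…m/…p95 gpu_us=…m/…p95` tail that Step 12.4 writes; `TryParseDiagTiming` is a hypothetical helper and the sample line carries made-up values:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical parser for the timing tail of a [WB-DIAG] line.
// Returns false when the line has no cpu_us / gpu_us fields.
static bool TryParseDiagTiming(string line,
    out long cpuMed, out long cpuP95, out long gpuMed, out long gpuP95)
{
    cpuMed = cpuP95 = gpuMed = gpuP95 = 0;
    Match m = Regex.Match(line, @"cpu_us=(\d+)m/(\d+)p95 gpu_us=(\d+)m/(\d+)p95");
    if (!m.Success) return false;
    cpuMed = long.Parse(m.Groups[1].Value);
    cpuP95 = long.Parse(m.Groups[2].Value);
    gpuMed = long.Parse(m.Groups[3].Value);
    gpuP95 = long.Parse(m.Groups[4].Value);
    return true;
}

// Made-up values, for illustration only.
string sample = "[WB-DIAG] entSeen=10 entDrawn=8 meshMissing=0 drawsIssued=2 " +
                "instances=40 groups=6 cpu_us=120m/310p95 gpu_us=450m/900p95";
if (TryParseDiagTiming(sample, out long cm, out long cp, out long gm, out long gp))
    Console.WriteLine($"cpu median={cm} p95={cp}, gpu median={gm} p95={gp}");
```

If the log format changes, the regular expression is the only part that needs to follow it.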
- [ ] **Step 13.2: Capture N.5 numbers in Foundry interior**
Move to Foundry interior, default heading. Same 30s. Record same metrics.
- [ ] **Step 13.3: Compare against N.4 baseline**
Stash N.5 changes:
```bash
git stash
git checkout c445364 # N.4 SHIP
dotnet build
```
Repeat measurements with N.4 active. Record numbers in the same format. Compare:
| Scene | N.4 cpu med | N.5 cpu med | Δ% | N.4 gpu med | N.5 gpu med | Δ% | N.4 draws | N.5 draws |
|---|---|---|---|---|---|---|---|---|
| Holtburg courtyard | | | | | | | | |
| Foundry interior | | | | | | | | |
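Restore N.5 and rebuild so later steps run the N.5 binaries. Skip `git stash pop` if the earlier `git stash` reported no local changes; otherwise it pops an unrelated stash:
```bash
git checkout claude/priceless-feistel-c12935
git stash pop   # only if the earlier `git stash` saved something
dotnet build
```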
- [ ] **Step 13.4: Verify acceptance gates**
Acceptance per spec §8.3:
- [ ] CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction).
- [ ] GPU rendering time within ±10% of N.4 (sanity).
- [ ] `drawsIssued ≤ 5 per pass`.
If gates fail: investigate. Common causes:
- Per-frame `glBufferData` is the bottleneck → defer to N.6 persistent-mapping (per Decision 7).
- SSBO indexing slower than expected on a particular driver → check NVIDIA / AMD / Intel separately.
- Group bucketing not sharing groups well → `groups` count dominates `drawsIssued`.
Save the table to a file: `docs/plans/2026-05-08-phase-n5-perf-baseline.md`. This goes in the SHIP commit.
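The three gates reduce to simple arithmetic on the recorded medians. A minimal sketch, assuming hypothetical `CheckGates` and `DeltaPercent` helpers and made-up input numbers; the sign convention for Δ% (negative means N.5 is faster) is an assumption of this sketch, since the table does not state one:

```csharp
using System;

// Hypothetical gate check using the three thresholds from Step 13.4.
// Integer arithmetic keeps the comparisons exact.
static (bool cpu, bool gpu, bool draws) CheckGates(
    long n4CpuUs, long n5CpuUs, long n4GpuUs, long n5GpuUs, int drawsPerPass)
{
    bool cpuOk = n5CpuUs * 100 <= n4CpuUs * 70;                  // N.5 at most 70% of N.4
    bool gpuOk = Math.Abs(n5GpuUs - n4GpuUs) * 10 <= n4GpuUs;    // within +/-10% of N.4
    bool drawsOk = drawsPerPass <= 5;
    return (cpuOk, gpuOk, drawsOk);
}

// Relative change from N.4 to N.5, in percent (negative = N.5 is faster).
static double DeltaPercent(long n4, long n5) => (n5 - n4) * 100.0 / n4;

// Made-up numbers, for illustration only.
var gates = CheckGates(n4CpuUs: 1000, n5CpuUs: 650, n4GpuUs: 2000, n5GpuUs: 2100, drawsPerPass: 1);
Console.WriteLine($"cpu={gates.cpu} gpu={gates.gpu} draws={gates.draws}");
Console.WriteLine($"cpu delta = {DeltaPercent(1000, 650)}%");
```

Run the check once per scene row; a single failing scene is enough to trigger the investigation list above.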
- [ ] **Step 13.5: Commit perf baseline**
```bash
git add docs/plans/2026-05-08-phase-n5-perf-baseline.md
git commit -m "phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry
[heredoc body]"
```
Heredoc body:
```
phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry
Captures CPU + GPU + draw-count numbers for the SHIP gate.
Acceptance gates:
- CPU dispatcher time ≤ 70% of N.4: [PASS / FAIL]
- GPU rendering time within ±10% of N.4: [PASS / FAIL]
- drawsIssued ≤ 5 per pass: [PASS / FAIL]
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 14: Visual verification at Holtburg + Foundry + magic content (USER GATE)
**Files:**
- (none — verification task; only commits if regressions found)
- [ ] **Step 14.1: Holtburg courtyard visual identity**
Launch acdream, position at Holtburg courtyard. Compare side-by-side against N.4 (use git stash + checkout flow from Task 13 if needed). Confirm:
- All scenery (trees, fences, rocks, buildings) renders correctly.
- No missing entities.
- No z-fighting introduced.
- No exploded character parts.
- [ ] **Step 14.2: Foundry interior visual identity**
Move to Foundry. Confirm same checklist. Pay attention to dense static-object scenes.
- [ ] **Step 14.3: Indoor → outdoor transition**
Walk through portal/door from outdoors to indoors and back. Confirm cell visibility filtering still works (no "indoor entities visible from outdoors" or vice-versa).
- [ ] **Step 14.4: Drudge / character close-up**
Find a drudge or NPC. Walk close. Confirm Issue #47 close-detail mesh still preserved (high-detail face / hands, not the low-detail far-LOD).
- [ ] **Step 14.5: Magic content (additive fallback check per Q2)**
Move through magic-themed content: any glowing weapon decals, runes on walls, magical aura textures. Compare against N.4. If anything appears "darker" or "less luminous" → that's the Decision 2 additive regression.
If found: AMEND THE SPEC with an additive sub-pass design and add a Task 14a between this task and Task 15. Do NOT proceed to ship without resolving.
- [ ] **Step 14.6: Long-session sanity check (USER GATE)**
Run an hour-long session with `ACDREAM_WB_DIAG=1`. Watch the `[WB-DIAG]` resident handle count grow (you'll need to add a `bindlessHandlesCount` field to the diag log — small task; if not done, just monitor process VRAM via Task Manager / similar). Expected: bounded plateau under 5K handles.
If unbounded growth: file an N.6 follow-up issue, don't block the ship.
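"Bounded plateau" can be judged by eye, but a rough mechanical check is easy once the handle count is logged. A minimal sketch, assuming a hypothetical `LooksBounded` helper fed with one handle count per `[WB-DIAG]` line; only the 5K cap comes from the expectation above, while the 5% late-versus-early tolerance is an assumption of this sketch:

```csharp
using System;
using System.Linq;

// Hypothetical check: does a series of resident-handle counts stay under the
// cap and stop growing in the second half of the session?
static bool LooksBounded(int[] counts, int cap = 5000, double tolerance = 0.05)
{
    if (counts.Length < 4) return counts.All(c => c <= cap);   // too few samples for a trend
    int half = counts.Length / 2;
    int earlyPeak = counts.Take(half).Max();
    int latePeak = counts.Skip(half).Max();
    bool underCap = earlyPeak <= cap && latePeak <= cap;
    bool plateaued = latePeak <= earlyPeak * (1 + tolerance);  // late peak barely above early peak
    return underCap && plateaued;
}

// Made-up series, for illustration only.
int[] settling = { 800, 2100, 3000, 3100, 3120, 3110, 3125, 3120 };
int[] climbing = { 100, 200, 300, 400, 500, 600, 700, 800 };
Console.WriteLine(LooksBounded(settling));
Console.WriteLine(LooksBounded(climbing));
```

The first series settles and passes; the second stays under the cap but keeps climbing, so it fails and would warrant the N.6 follow-up.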
- [ ] **Step 14.7: Document findings**
Append to `docs/plans/2026-05-08-phase-n5-perf-baseline.md`:
```markdown
## Visual verification (Task 14)
- Holtburg courtyard: PASS / FAIL (note specific issues)
- Foundry interior: PASS / FAIL
- Cell transitions: PASS / FAIL
- Character close-up (Issue #47): PASS / FAIL
- Magic content (additive check): PASS / FAIL
- Long-session sanity: PASS / FAIL — peak resident handles ~N
```
- [ ] **Step 14.8: Commit findings (no code change)**
```
phase(N.5) Task 14: visual verification — all gates pass
[Or if any failed: amend with sub-task to address.]
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 15: Delete legacy mesh_instanced shader files
**Files:**
- Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert`
- Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`
- Modify: `src/AcDream.App/Rendering/GameWindow.cs` (remove fallback path)
This task removes the fallback shader path. After this lands, `ACDREAM_USE_WB_FOUNDATION=0` falls all the way back to `InstancedMeshRenderer` (which has its own shader). The intermediate "WB foundation on but bindless missing" state no longer exists — if bindless is missing, we treat it as foundation-off.
- [ ] **Step 15.1: Delete shader files**
```bash
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.vert
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.frag
```
- [ ] **Step 15.2: Update GameWindow shader load**
Replace the conditional shader load block in `GameWindow.cs` with the single modern path:
```csharp
if (_bindlessSupport is not null)
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_modern.vert"),
Path.Combine(shadersDir, "mesh_modern.frag"));
Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
// Bindless missing — log and skip WbDrawDispatcher construction so
// InstancedMeshRenderer handles all rendering (same effect as
// ACDREAM_USE_WB_FOUNDATION=0).
Console.WriteLine("[N.5] bindless extension missing — falling back to InstancedMeshRenderer");
// _meshShader stays unloaded; InstancedMeshRenderer owns its own shader path.
// The `_dispatcher = new WbDrawDispatcher(...)` site below must be wrapped:
// _dispatcher = (_bindlessSupport is not null) ? new WbDrawDispatcher(...) : null;
// and the per-frame draw call must guard `_dispatcher?.Draw(...)`.
}
```
Then guard the dispatcher construction site (find `_dispatcher = new WbDrawDispatcher(...)` in the same file):
```csharp
_dispatcher = (_bindlessSupport is not null)
? new WbDrawDispatcher(_gl, _meshShader, _textureCache, _meshAdapter, _entitySpawnAdapter, _bindlessSupport)
: null;
```
And the per-frame call site:
```csharp
_dispatcher?.Draw(camera, landblockEntries, frustum, ...);
```
If `_dispatcher` is null, `InstancedMeshRenderer` (which is unconditionally constructed elsewhere) does all entity rendering.
- [ ] **Step 15.3: Build + tests**
Run: `dotnet build`
Expected: PASS.
Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: PASS.
- [ ] **Step 15.4: Smoke test (legacy fallback path)**
Test the legacy fallback by running with foundation off:
```powershell
$env:ACDREAM_USE_WB_FOUNDATION = "0"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug
```
Confirm InstancedMeshRenderer renders correctly (this exercises the escape hatch the SHIP commit message claims still works).
- [ ] **Step 15.5: Commit**
```
phase(N.5) Task 15: delete legacy mesh_instanced shader files
mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses
mesh_modern (bindless + multi-draw indirect). Legacy escape hatch
runs via InstancedMeshRenderer + ACDREAM_USE_WB_FOUNDATION=0 — its
own shader path, untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 16: Update CLAUDE.md WB integration cribs
**Files:**
- Modify: `CLAUDE.md`
- [ ] **Step 16.1: Read existing WB integration cribs section**
Read `CLAUDE.md` lines 28-80 (the "WB integration cribs" section).
- [ ] **Step 16.2: Add N.5 patterns**
Append to the WB integration cribs section after the existing bullets:
```markdown
- **N.5 modern dispatch** uses bindless textures + multi-draw indirect.
`WbDrawDispatcher.Draw` uploads three buffers per frame: two SSBOs,
`_instanceSsbo` (mat4 per instance) and `_batchSsbo` (texture handle +
layer + flags per group), plus `_indirectBuffer`
(`DrawElementsIndirectCommand[]`, read through the draw-indirect binding). Two
`glMultiDrawElementsIndirect` calls per frame: opaque, then transparent.
See `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
- **`TextureCache` requires `BindlessSupport`** for the WB modern path.
Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles
made resident at upload time. Old `uint`-returning methods stay for
Sky / Terrain / Debug renderers.
- **Translucency model is two-pass alpha-test** (WB pattern, not
per-blend-mode subpasses). Opaque pass discards `α<0.95`, transparent
pass discards `α≥0.95`. Native `Additive` blend renders as alpha-blend
on GfxObj surfaces — falsifiable; if a regression shows up on magic
content, add a third indirect call with `glBlendFunc(SrcAlpha, One)`.
- **Per-instance highlight (selection blink) is reserved.** `InstanceData`
has a documented hook for `vec4 highlightColor` — Phase B.4 follow-up
adds the field + plumbs server-side selection state. Stride grows from
64 → 80 bytes when added; shader updates trivially.
```
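The two discard rules in the translucency crib are complementary, so every fragment survives exactly one pass. A minimal CPU-side sketch of that partition, assuming a hypothetical `SurvivesPass` helper and a 0/1 pass numbering chosen for illustration (the shader's actual `uRenderPass` encoding is not shown in this section):

```csharp
using System;

const float AlphaCutoff = 0.95f;   // threshold quoted in the crib

// pass 0 = opaque, pass 1 = transparent (hypothetical numbering for this sketch).
bool SurvivesPass(float alpha, int pass) =>
    pass == 0 ? alpha >= AlphaCutoff     // opaque pass discards alpha < 0.95
              : alpha < AlphaCutoff;     // transparent pass discards alpha >= 0.95

foreach (float a in new[] { 0.0f, 0.5f, 0.94f, 0.95f, 1.0f })
{
    bool opaque = SurvivesPass(a, 0);
    bool transparent = SurvivesPass(a, 1);
    // Exactly one pass keeps each fragment, so nothing is dropped or drawn twice.
    Console.WriteLine($"alpha={a}: opaque={opaque} transparent={transparent}");
}
```

A third, additive pass would need its own classification rule, since alpha alone cannot distinguish additive surfaces from alpha-blended ones.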
- [ ] **Step 16.3: Build (sanity check; the change is markdown-only, so this just confirms the tree still builds)**
Run: `dotnet build`
Expected: PASS.
- [ ] **Step 16.4: Commit**
```
phase(N.5) Task 16: extend CLAUDE.md WB cribs with N.5 patterns
Adds four new bullets covering the modern dispatch's three-buffer layout,
TextureCache.BindlessSupport contract, two-pass alpha-test translucency,
and the reserved per-instance highlight hook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 17: Update memory + roadmap
**Files:**
- Create: `memory/project_phase_n5_state.md` (under user's `~/.claude/projects/.../memory/`)
- Modify: `MEMORY.md` (under user's `~/.claude/projects/.../memory/`)
- Modify: `docs/plans/2026-04-11-roadmap.md`
Memory files live under `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\` per the `auto memory` system prompt section.
- [ ] **Step 17.1: Create memory entry for N.5 state**
Create `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\project_phase_n5_state.md`:
```markdown
---
name: Project: Phase N.5 state (shipped 2026-05-XX)
description: N.5 lifted WbDrawDispatcher onto bindless + multi-draw indirect. CPU dispatcher time dropped to [X]% of N.4 (fill in from the Task 13 table). Three new gotchas captured.
type: project
---
**Phase N.5 — Modern Rendering Path — shipped 2026-05-XX.**
WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect.
Per-frame: 3 buffer uploads (2 SSBOs + 1 indirect buffer) + 2 indirect calls (opaque + transparent). All
textures are 1-layer Texture2DArray; sampler2DArray in shader.
Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
Spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
**Why:** N.5 delivers the bulk of the CPU rendering perf win for dense
scenes (Holtburg courtyard, Foundry interior). N.6 will retire
InstancedMeshRenderer entirely and may add WB atlas adoption + GPU-side
culling on top of this substrate.
**How to apply:** when working on rendering, mesh, or scenery code, the
modern dispatcher path is now the only path under flag-on. Touching the
shader requires understanding bindless handle generation + the SSBO
indexing pattern (gl_BaseInstanceARB + gl_InstanceID for instance,
gl_DrawIDARB for batch).
## Three gotchas surfaced during N.5 implementation
[FILL IN AT SHIP TIME — common candidates:]
1. SSBO upload size off-by-one if you forget instance-stride alignment.
2. `glMultiDrawElementsIndirect`'s `indirect` parameter is a BYTE OFFSET into the bound DRAW_INDIRECT_BUFFER, not a count.
3. Bindless handle 0 means `glGetTextureHandleARB` failed; it never refers to a usable texture. Guard for it before populating BatchData.
```
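Candidate gotchas 1 and 2 in the template both come down to byte arithmetic. A minimal sketch of the sizes involved, assuming the opaque commands sit first in the indirect buffer as the plan's architecture describes; `TransparentOffset` is a hypothetical helper, and the 20-byte command size follows from the five 32-bit fields of `DrawElementsIndirectCommand` in the OpenGL specification:

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// DrawElementsIndirectCommand: count, instanceCount, firstIndex, baseVertex, baseInstance.
const int CommandBytes = 5 * sizeof(uint);                              // 20 bytes per command

int instanceStride = Marshal.SizeOf<Matrix4x4>();                       // 64: one mat4 per instance
int strideWithHighlight = instanceStride + Marshal.SizeOf<Vector4>();   // 80 once highlightColor lands

// The `indirect` argument is a byte offset into the bound indirect buffer,
// so the transparent section starts at opaqueCount * 20, not at opaqueCount.
nint TransparentOffset(int opaqueCommandCount) => (nint)opaqueCommandCount * CommandBytes;

int instanceCount = 1000;                                               // made-up value
Console.WriteLine($"instance upload bytes = {instanceCount * instanceStride}");
Console.WriteLine($"stride today = {instanceStride}, with highlight = {strideWithHighlight}");
Console.WriteLine($"transparent section offset after 3 opaque commands = {TransparentOffset(3)}");
```

Deriving sizes from the types rather than hard-coding them keeps the upload size correct when the instance struct grows.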
- [ ] **Step 17.2: Add MEMORY.md index entry**
Edit `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\MEMORY.md`. Add immediately after the existing N.4 line:
```markdown
- [Project: Phase N.5 state](project_phase_n5_state.md) — **N.5 SHIPPED 2026-05-XX.** WbDrawDispatcher on bindless + multi-draw indirect. CPU dispatcher [X]% of N.4 (from the Task 13 table). Three driver-touching gotchas captured.
```
- [ ] **Step 17.3: Update roadmap**
Edit `docs/plans/2026-04-11-roadmap.md`. Move N.5 from "Currently in flight" to the "Shipped" table. Add N.6 as the new "in flight" or "next" entry per the user's preferred sequencing.
- [ ] **Step 17.4: Commit memory + roadmap**
```bash
git add docs/plans/2026-04-11-roadmap.md
git commit -m "phase(N.5): roadmap — N.5 shipped, N.6 next
[heredoc body]"
```
(Memory files live outside the repository, under `~/.claude/...`, so they are not part of this commit.)
Heredoc body:
```
phase(N.5): roadmap — N.5 shipped, N.6 next
Moves N.5 from in-flight to Shipped. Records the perf wins from
Task 13's measurement table. N.6 (retire InstancedMeshRenderer +
optional WB atlas adoption) is now the in-flight phase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Task 18: Plan finalization — append SHIP section
**Files:**
- Modify: `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md` (this file)
- [ ] **Step 18.1: Add SHIP section at the end of this plan**
Append to this plan file (`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`):
```markdown
---
## SHIP record
**Shipped: 2026-05-XX** at commit [SHIP commit SHA].
**Acceptance gates:**
- [✓] Visual identity to N.4 — confirmed at Holtburg courtyard, Foundry interior, indoor↔outdoor transitions, drudge close-up, magic content.
- [✓] CPU dispatcher time ≤ 70% of N.4 — measured: N.4=Xµs / N.5=Yµs (Z% reduction).
- [✓] GPU rendering time within ±10% of N.4 — measured: N.4=Aµs / N.5=Bµs.
- [✓] `drawsIssued ≤ 5 per pass` — measured: N opaque + M transparent per frame.
- [✓] All tests green — 60+ N.4 tests + 7 new N.5 tests.
- [✓] `ACDREAM_USE_WB_FOUNDATION=0` still works — InstancedMeshRenderer fallback verified.
**Adjustments captured during execution:** [list any spec amendments — e.g., additive sub-pass added if Task 14.5 found regressions].
**Out-of-scope follow-ups (per spec §10):**
- N.6: retire `InstancedMeshRenderer`.
- N.6 candidate: persistent-mapped buffers if `glBufferData` shows up in profiling.
- N.6 candidate: WB atlas adoption for memory savings on shared content.
- Phase B.4 follow-up: per-instance `highlightColor` for selection blink.
- (Long-session memory pressure — log evidence in N.6 watchlist.)
```
- [ ] **Step 18.2: Commit**
```bash
git add docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
git commit -m "phase(N.5): plan finalization — SHIP record appended
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
## Task 19: SHIP commit
**Files:**
- (no code change — single empty commit OR amend the perf baseline commit's message)
- [ ] **Step 19.1: Verify clean tree + green build/test**
```bash
git status
dotnet build
dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"
```
Expected: clean tree, build PASS, all tests PASS.
- [ ] **Step 19.2: Create SHIP commit**
```bash
git commit --allow-empty -m "phase(N.5): SHIP — modern rendering path on N.4 dispatcher
[heredoc body]"
```
Heredoc body:
```
phase(N.5): SHIP — modern rendering path on N.4 dispatcher
Bindless textures + glMultiDrawElementsIndirect. Per-frame: 3 buffer
uploads (instance SSBO, batch SSBO, indirect commands), 2 indirect calls
(opaque + transparent), 1 VAO bind. Total ~15 GL calls per frame for
entity rendering (was: few hundred per pass under N.4).
Acceptance gates (from spec §8.3):
- Visual identity to N.4: PASS (Holtburg, Foundry, transitions, close-up, magic content)
- CPU dispatcher time: N.4=[Xµs] → N.5=[Yµs] ([Z]% reduction; gate ≥30%)
- GPU rendering time: within ±10% of N.4 — PASS
- drawsIssued ≤ 5 per pass: PASS
- All tests green: PASS (67+ tests)
- Legacy fallback (ACDREAM_USE_WB_FOUNDATION=0): PASS
Plan archived at docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
- [ ] **Step 19.3: Confirm commit**
```bash
git log --oneline -5
```
Expected: top commit is "phase(N.5): SHIP — ...".
---
## Self-review checklist
After all tasks complete, verify against the spec:
- [ ] **Spec §2 Decision 1** (sampler2DArray): TextureCache uploads as Texture2DArray (Task 2). Shader samples via `sampler2DArray` (Task 5). ✓
- [ ] **Spec §2 Decision 2** (two-pass alpha-test): Shader uses `uRenderPass` discard (Task 5). Dispatcher runs two passes (Task 10). Translucency partition test (Task 11). ✓
- [ ] **Spec §2 Decision 3** (SSBO): `_instanceSsbo` + `_batchSsbo` at bindings 0+1 (Tasks 7+10). Shader reads via `gl_BaseInstanceARB` + `gl_DrawIDARB` (Task 5). ✓
- [ ] **Spec §2 Decision 4** (resident on upload): `MakeResidentHandle` (Task 3) + Dispose order (Task 4). ✓
- [ ] **Spec §2 Decision 5** (two-way flag): Capability check + fallback in GameWindow (Task 6+15). ✓
- [ ] **Spec §2 Decision 6** (CPU stopwatch + GL queries): Task 12. Numbers in SHIP message (Task 19). ✓
- [ ] **Spec §2 Decision 7** (defer persistent-mapped): No persistent-mapped code in this plan. ✓
- [ ] **Spec §2 Decision 8** (defer highlight): InstanceData comment reserves field (Task 5). ✓
- [ ] **Spec §4.1 TextureCache changes**: Tasks 2-4. ✓
- [ ] **Spec §4.2 WbDrawDispatcher changes**: Tasks 7-10. ✓
- [ ] **Spec §4.3 New shader files**: Task 5. ✓
- [ ] **Spec §6 Translucency detail**: Tasks 10-11. ✓
- [ ] **Spec §7 Error handling**: Task 6 (capability + compile fallback) + Task 4 (disposal order). ✓
- [ ] **Spec §8 Testing**: Task 9 (indirect builder), Task 11 (translucency), Task 13 (perf), Task 14 (visual). ✓
- [ ] **Spec §9 Risks**: Capability check + fallback paths in Tasks 6+15. ✓
No placeholders beyond ship-time values (dates, commit SHAs, measured numbers, gotchas). No "implement later" tasks. Every step has either code or an exact command.
---
*End of plan.*