Phase N.5 — Modern Rendering Path — Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Lift WbDrawDispatcher onto bindless textures + multi-draw indirect, reducing per-pass GL calls from ~hundreds to ~5, with visual identity to N.4.
Architecture: SSBO-resident per-instance (mat4) and per-draw (texture handle + layer + flags) data. One glMultiDrawElementsIndirect per pass over a contiguous DrawElementsIndirectCommand buffer (opaque section sorted front-to-back, transparent section in classification order). 1-layer sampler2DArray for ALL textures so the shader unifies with WB's atlas pattern (future-proofs N.6+ atlas adoption). WB's two-pass alpha-test for translucency.
Tech Stack: .NET 10, C#, Silk.NET.OpenGL 2.23, Silk.NET.OpenGL.Extensions.ARB, GLSL 4.30 + GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters. xUnit for tests.
Predecessor: N.4 ship at c445364 + spec at docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md.
File map
Create:
- src/AcDream.App/Rendering/Wb/BindlessSupport.cs: thin wrapper around Silk.NET.OpenGL.Extensions.ARB.ArbBindlessTexture, capability detection.
- src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs: DEIC struct for indirect dispatch.
- src/AcDream.App/Rendering/Shaders/mesh_modern.vert: bindless + SSBO + indirect vertex shader.
- src/AcDream.App/Rendering/Shaders/mesh_modern.frag: alpha-test discard fragment shader.
- tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs
- tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs
- tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs
Modify:
- src/AcDream.App/AcDream.App.csproj: add Silk.NET.OpenGL.Extensions.ARB package.
- src/AcDream.App/Rendering/TextureCache.cs: Texture2DArray uploads, three Bindless GetOrUpload* methods, Dispose order.
- src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs: replace draw loop with SSBO + indirect dispatch, add timing diagnostics.
- src/AcDream.App/Rendering/GameWindow.cs: load mesh_modern shaders + capability check + fallback.
- CLAUDE.md: extend "WB integration cribs" with N.5 patterns.
- docs/plans/2026-04-11-roadmap.md: move N.5 to "shipped" at end.
Delete (Task 15):
- src/AcDream.App/Rendering/Shaders/mesh_instanced.vert
- src/AcDream.App/Rendering/Shaders/mesh_instanced.frag
Workflow per task
- Read the spec section the task implements.
- For TDD-friendly tasks: write the failing test → run → verify failure → implement → run → verify pass → commit.
- For shader / pure-integration tasks (no unit-testable behavior): build green → visual smoke test → commit.
- After every commit, run dotnet build (full) + dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless". Both must be green.
Commit message convention (matching N.4):
- Tasks 1-14: phase(N.5) Task N: <description>
- Tasks 15-19: phase(N.5): <description>
- Task 20: phase(N.5): SHIP — <perf numbers + summary>
Always co-author: Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 1: Add ArbBindlessTexture package + BindlessSupport wrapper
Files:
- Modify: src/AcDream.App/AcDream.App.csproj
- Create: src/AcDream.App/Rendering/Wb/BindlessSupport.cs
(The test file tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs is created in Task 3, NOT this task.)
- Step 1.1: Add package reference
In src/AcDream.App/AcDream.App.csproj, add inside the existing <ItemGroup> containing Silk.NET.OpenGL:
<PackageReference Include="Silk.NET.OpenGL.Extensions.ARB" Version="2.23.0" />
- Step 1.2: Build to verify package resolves
Run: dotnet build src/AcDream.App/AcDream.App.csproj
Expected: PASS, package restored.
- Step 1.3: Write the BindlessSupport class
Create src/AcDream.App/Rendering/Wb/BindlessSupport.cs:
using Silk.NET.OpenGL;
using Silk.NET.OpenGL.Extensions.ARB;
namespace AcDream.App.Rendering.Wb;
/// <summary>
/// Thin wrapper around <see cref="ArbBindlessTexture"/> + capability detection
/// for the modern rendering path. Constructed once at startup. Production callers
/// should go through <see cref="TryCreate"/>, which returns false when the
/// extension isn't available; the constructor itself performs no check.
/// </summary>
public sealed class BindlessSupport
{
private readonly GL _gl;
private readonly ArbBindlessTexture _ext;
public bool IsAvailable => true; // Construction succeeded
public BindlessSupport(GL gl, ArbBindlessTexture extension)
{
_gl = gl;
_ext = extension;
}
public static bool TryCreate(GL gl, out BindlessSupport? support)
{
if (gl.TryGetExtension<ArbBindlessTexture>(out var ext))
{
support = new BindlessSupport(gl, ext);
return true;
}
support = null;
return false;
}
/// <summary>Get a 64-bit bindless handle for the texture and make it resident.
/// Idempotent: handle is the same for a given texture name.</summary>
public ulong GetResidentHandle(uint textureName)
{
ulong h = _ext.GetTextureHandle(textureName);
if (!_ext.IsTextureHandleResident(h))
_ext.MakeTextureHandleResident(h);
return h;
}
/// <summary>Release residency for a handle. Call before deleting the underlying texture.</summary>
public void MakeNonResident(ulong handle)
{
if (_ext.IsTextureHandleResident(handle))
_ext.MakeTextureHandleNonResident(handle);
}
/// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
/// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
/// from this extension.</summary>
public bool HasShaderDrawParameters(GL gl)
{
int n = 0;
gl.GetInteger(GLEnum.NumExtensions, out n);
for (int i = 0; i < n; i++)
{
string ext = gl.GetStringS(StringName.Extensions, (uint)i);
if (ext == "GL_ARB_shader_draw_parameters") return true;
}
return false;
}
}
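The capability check above comes down to an exact string match over the extension names the driver reports. The following is a minimal sketch of that matching logic without a GL context. The helper name HasExtension and the sample extension list are assumptions made for illustration and are not part of the plan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the strings a driver would report one by one
// when the extension list is enumerated.
var reported = new List<string>
{
    "GL_ARB_bindless_texture",
    "GL_ARB_shader_draw_parameters",
    "GL_ARB_multi_draw_indirect",
};

// Exact, case-sensitive comparison: a prefix such as "GL_ARB_shader_draw"
// must not count as a match.
bool HasExtension(IEnumerable<string> extensions, string name)
{
    foreach (string ext in extensions)
    {
        if (string.Equals(ext, name, StringComparison.Ordinal)) return true;
    }
    return false;
}

Console.WriteLine(HasExtension(reported, "GL_ARB_shader_draw_parameters")); // True
Console.WriteLine(HasExtension(reported, "GL_ARB_shader_draw"));            // False
```

Keeping the comparison exact avoids accepting a similarly named extension by accident.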
- Step 1.4: Build to verify
Run: dotnet build
Expected: PASS.
- Step 1.5: Commit
git add src/AcDream.App/AcDream.App.csproj src/AcDream.App/Rendering/Wb/BindlessSupport.cs
git commit -m "phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection
[heredoc body]"
Use this exact heredoc body:
phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection
Adds Silk.NET.OpenGL.Extensions.ARB 2.23.0 package and a thin
BindlessSupport wrapper exposing GetResidentHandle / MakeNonResident /
HasShaderDrawParameters. TryCreate returns false if the bindless
extension isn't present, letting WbFoundationFlag fall back to legacy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 2: Add parallel Texture2DArray upload path to TextureCache
Files:
- Modify: src/AcDream.App/Rendering/TextureCache.cs
AMENDED 2026-05-08 after first-pass implementation surfaced a flaw. Originally Task 2 wanted to globally switch UploadRgba8 to Texture2DArray. Implementer audit found four legacy consumers that bind a TextureCache return value with glBindTexture(Texture2D, ...): WbDrawDispatcher.cs:363 (rewritten in Task 10 — but breaks meanwhile), StaticMeshRenderer.cs:126,223, InstancedMeshRenderer.cs:282,361 (legacy escape hatch — must keep working under foundation flag-off), and ParticleRenderer.cs:162. A texture has ONE GL target — can't be both Texture2D and Texture2DArray. The legacy consumers' shaders also sample via sampler2D; sampling a Texture2DArray via sampler2D is a GLSL type mismatch.
Revised approach: ADD a parallel UploadRgba8AsLayer1Array method. Don't touch the existing UploadRgba8. Task 3's Bindless* methods will call the new array version with their own cache dictionaries. Legacy callers stay on the Texture2D path, untouched. WB modern dispatcher (Task 10) uses the array path.
Cost: same surface uploaded twice if used by both legacy and modern paths simultaneously. In practice the overlap is small, and N.6 deletes the legacy path entirely. Acceptable transition cost.
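The parallel-cache idea described above can be sketched without GL as two dictionaries keyed by surface id, one per texture target. Everything here is illustrative: the fake uploaders, the dictionary names, and the surface id 0x08000001 are assumptions, and the real TextureCache may key its caches differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fake uploaders: each call hands out a fresh GL texture name.
uint nextName = 1;
uint UploadAs2D(uint surfaceId) => nextName++;      // stands in for UploadRgba8
uint UploadAsArray(uint surfaceId) => nextName++;   // stands in for UploadRgba8AsLayer1Array

// One dictionary per GL target, both keyed by surface id.
var legacyBySurface = new Dictionary<uint, uint>();
var arrayBySurface = new Dictionary<uint, uint>();

uint GetOrUpload(Dictionary<uint, uint> cache, uint surfaceId, Func<uint, uint> upload)
{
    if (!cache.TryGetValue(surfaceId, out uint name))
    {
        name = upload(surfaceId);   // cache miss: upload once for this target
        cache[surfaceId] = name;
    }
    return name;
}

uint legacyName = GetOrUpload(legacyBySurface, 0x08000001, UploadAs2D);
uint arrayName = GetOrUpload(arrayBySurface, 0x08000001, UploadAsArray);
uint arrayAgain = GetOrUpload(arrayBySurface, 0x08000001, UploadAsArray);

Console.WriteLine($"legacy={legacyName} array={arrayName} arrayAgain={arrayAgain}");
// prints "legacy=1 array=2 arrayAgain=2"
```

The same surface ends up with two texture names, one per target, which is the bounded duplication the amendment accepts; repeated lookups on either path still hit their own cache.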
- Step 2.1: Read existing UploadRgba8 in TextureCache.cs
Read src/AcDream.App/Rendering/TextureCache.cs:256-280. Confirm it uses TextureTarget.Texture2D + TexImage2D.
- Step 2.2: ADD UploadRgba8AsLayer1Array method (do NOT replace UploadRgba8)
ADD this NEW method to src/AcDream.App/Rendering/TextureCache.cs immediately after the existing UploadRgba8 (which stays untouched):
/// <summary>
/// Variant of <see cref="UploadRgba8"/> that uploads pixel data as a 1-layer
/// Texture2DArray. Required by the WB modern rendering path which samples via
/// sampler2DArray in its bindless shader. Pixel data is identical.
/// </summary>
private uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
{
uint tex = _gl.GenTexture();
_gl.BindTexture(TextureTarget.Texture2DArray, tex);
fixed (byte* p = decoded.Rgba8)
_gl.TexImage3D(
TextureTarget.Texture2DArray,
0,
InternalFormat.Rgba8,
(uint)decoded.Width,
(uint)decoded.Height,
depth: 1,
border: 0,
PixelFormat.Rgba,
PixelType.UnsignedByte,
p);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
_gl.BindTexture(TextureTarget.Texture2DArray, 0);
return tex;
}
- Step 2.3: Build + run tests
Run: dotnet build
Expected: PASS. The new method is unused at this point, which is fine: Task 3 wires the bindless variants to call it. If TreatWarningsAsErrors=true turns the unused-method diagnostic into a build error, suppress it with the existing project pattern (typically a per-method attribute). Otherwise accept the warning, since Task 3 adds the first caller shortly afterwards.
Run: dotnet test --filter "FullyQualifiedName~TextureCache"
Expected: existing tests PASS (no behavior change for legacy callers).
- Step 2.4: Commit
phase(N.5) Task 2: parallel Texture2DArray upload path in TextureCache
Adds UploadRgba8AsLayer1Array — uploads pixel data as a 1-layer
Texture2DArray. Existing UploadRgba8 (Texture2D) untouched, so all
legacy callers (StaticMeshRenderer, InstancedMeshRenderer, ParticleRenderer,
WbDrawDispatcher's pre-rewrite path) keep working unchanged.
Required for Task 3's Bindless* methods which need the Texture2DArray
target so the WB modern shader can sample via sampler2DArray. Same
surface may be uploaded both ways during the N.5/N.6 transition;
doubling is bounded and acceptable. After N.6 retires legacy
renderers entirely, the legacy UploadRgba8 becomes unused and is
deleted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 3: Add bindless handle cache + Bindless GetOrUpload methods
Files:
- Modify: src/AcDream.App/Rendering/TextureCache.cs
- Create: tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs
- Step 3.1: Read TextureCache constructor + cache fields
Read src/AcDream.App/Rendering/TextureCache.cs:1-50. Note the existing dictionaries: _handlesBySurfaceId, _handlesByOverridden, _handlesByPalette.
- Step 3.2: Add BindlessSupport dependency to TextureCache constructor
In src/AcDream.App/Rendering/TextureCache.cs, change the constructor from:
public TextureCache(GL gl, DatCollection dats)
{
_gl = gl;
_dats = dats;
}
to:
private readonly Wb.BindlessSupport? _bindless;
private readonly Dictionary<uint, ulong> _bindlessHandlesByGlName = new();
public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
{
_gl = gl;
_dats = dats;
_bindless = bindless;
}
The optional parameter keeps backward compatibility with consumers that don't need bindless (sky, terrain, etc.).
- Step 3.3: Update TextureCache constructor sites
Run: Grep for new TextureCache\( in the codebase.
Identified call site: src/AcDream.App/Rendering/GameWindow.cs (typically around the WB foundation init).
Modify GameWindow.cs to pass the BindlessSupport instance — but only after Task 6 wires it up. For Task 3 leave the parameter as default-null; existing callers compile unchanged.
- Step 3.4: Add MakeResidentHandle helper + three Bindless GetOrUpload methods
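Add to src/AcDream.App/Rendering/TextureCache.cs immediately after the existing GetOrUploadWithPaletteOverride overloads. This listing predates the Task 2 amendment: it delegates to the existing Texture2D-returning methods, whereas the amended plan has the Bindless* methods upload through UploadRgba8AsLayer1Array with their own cache dictionaries. Adjust accordingly at dispatch time: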
/// <summary>
/// 64-bit bindless handle variant of <see cref="GetOrUpload"/>.
/// Throws if BindlessSupport wasn't provided to the constructor.
/// </summary>
public ulong GetOrUploadBindless(uint surfaceId)
{
uint name = GetOrUpload(surfaceId);
return MakeResidentHandle(name);
}
/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithOrigTextureOverride"/>.</summary>
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId)
{
uint name = GetOrUploadWithOrigTextureOverride(surfaceId, overrideOrigTextureId);
return MakeResidentHandle(name);
}
/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithPaletteOverride"/>
/// taking a precomputed palette hash.</summary>
public ulong GetOrUploadWithPaletteOverrideBindless(
uint surfaceId,
uint? overrideOrigTextureId,
PaletteOverride paletteOverride,
ulong precomputedPaletteHash)
{
uint name = GetOrUploadWithPaletteOverride(surfaceId, overrideOrigTextureId, paletteOverride, precomputedPaletteHash);
return MakeResidentHandle(name);
}
private ulong MakeResidentHandle(uint glTextureName)
{
if (glTextureName == 0) return 0;
if (_bindless is null)
throw new InvalidOperationException(
"TextureCache constructed without BindlessSupport — cannot generate bindless handles. " +
"WbDrawDispatcher requires the bindless ctor overload.");
if (_bindlessHandlesByGlName.TryGetValue(glTextureName, out var h))
return h;
h = _bindless.GetResidentHandle(glTextureName);
_bindlessHandlesByGlName[glTextureName] = h;
return h;
}
- Step 3.5: Write the contract marker test
Create tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs:
using AcDream.App.Rendering;
using AcDream.App.Rendering.Wb;
using DatReaderWriter;
using Xunit;
namespace AcDream.Core.Tests.Rendering;
/// <summary>
/// Lightweight unit tests that exercise <see cref="TextureCache"/>'s bindless
/// methods through their dependency on <see cref="BindlessSupport"/>.
/// These tests run without a GL context — they verify guard behavior. Real
/// bindless integration is covered by visual verification (Task 17).
/// </summary>
public sealed class TextureCacheBindlessTests
{
[Fact]
public void GetOrUploadBindless_ThrowsWithoutBindlessSupport()
{
// We can't easily construct a real TextureCache in a headless test.
// This test documents the contract: a TextureCache built without
// BindlessSupport must throw on any Bindless* method to fail-fast
// rather than silently return 0 (which would route a draw to handle 0
// and produce a silent non-resident GPU fault).
// Marker test — the actual throw lives in TextureCache.MakeResidentHandle
// and is reached only via GL-bound Bindless* methods. This test passes
// by virtue of the throw existing in source. See Task 3 Step 3.4 for
// the contract definition.
Assert.True(true, "Contract documented in TextureCache.MakeResidentHandle.");
}
}
(The "real" bindless test surface is the visual gate at Task 17 — there's no headless GL context for unit-testing handle generation. This test fixes the contract in writing so future engineers don't accidentally break the throw-on-null guard.)
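If a headless check of the guard is wanted later, one option is to route handle creation through a delegate so the same logic runs without GL. The sketch below assumes such a seam: the Func parameter standing in for BindlessSupport.GetResidentHandle and the fake handle values are hypothetical and do not exist in TextureCache today.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical seam: a delegate that stands in for BindlessSupport.GetResidentHandle.
Func<uint, ulong>? noBindless = null;
Func<uint, ulong> fakeBindless = name => 0x1000UL + name;
var cache = new Dictionary<uint, ulong>();

ulong MakeResidentHandle(uint glTextureName, Func<uint, ulong>? getResident, Dictionary<uint, ulong> handles)
{
    if (glTextureName == 0) return 0;   // no texture, no handle
    if (getResident is null)
        throw new InvalidOperationException("bindless support missing");
    if (handles.TryGetValue(glTextureName, out ulong h)) return h;   // cache hit
    h = getResident(glTextureName);
    handles[glTextureName] = h;
    return h;
}

bool threw = false;
try { MakeResidentHandle(7, noBindless, cache); }
catch (InvalidOperationException) { threw = true; }

Console.WriteLine(threw);                                      // True
Console.WriteLine(MakeResidentHandle(7, fakeBindless, cache)); // 4103
```

The order of the checks mirrors the listing in Step 3.4: texture name 0 short-circuits before the null guard, so only real textures can trigger the throw.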
- Step 3.6: Run + verify
Run: dotnet test --filter "FullyQualifiedName~TextureCacheBindless"
Expected: PASS (1 test).
Run full build: dotnet build
Expected: PASS.
- Step 3.7: Commit
phase(N.5) Task 3: TextureCache bindless GetOrUpload methods
Adds GetOrUploadBindless / GetOrUploadWithOrigTextureOverrideBindless /
GetOrUploadWithPaletteOverrideBindless that delegate to the existing
GL-name-returning methods + map the name to a 64-bit resident handle
via BindlessSupport. Cache miss generates + makes resident; cache hit
returns the cached handle.
Constructor gains an optional BindlessSupport parameter — null keeps
backward compat for callers (sky, terrain, debug) that don't need
bindless. Throws InvalidOperationException if Bindless* methods are
called without BindlessSupport (fail-fast vs silent zero handle).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 4: Update TextureCache.Dispose for bindless release order
Files:
- Modify: src/AcDream.App/Rendering/TextureCache.cs
- Step 4.1: Replace Dispose method
Replace the existing Dispose in src/AcDream.App/Rendering/TextureCache.cs (currently around line 282) with:
public void Dispose()
{
// Release bindless handles BEFORE deleting underlying textures.
// glDeleteTextures of a texture with resident handles is undefined behavior.
if (_bindless is not null)
{
foreach (var h in _bindlessHandlesByGlName.Values)
_bindless.MakeNonResident(h);
}
_bindlessHandlesByGlName.Clear();
foreach (var h in _handlesBySurfaceId.Values)
_gl.DeleteTexture(h);
_handlesBySurfaceId.Clear();
foreach (var h in _handlesByOverridden.Values)
_gl.DeleteTexture(h);
_handlesByOverridden.Clear();
foreach (var h in _handlesByPalette.Values)
_gl.DeleteTexture(h);
_handlesByPalette.Clear();
if (_magentaHandle != 0)
{
_gl.DeleteTexture(_magentaHandle);
_magentaHandle = 0;
}
}
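The ordering rule in this Dispose (release residency first, delete textures second) can be checked without GL by logging the calls. This is a minimal sketch: the event log, the handle values, and the DisposeAll name are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event log standing in for real GL / bindless calls.
var log = new List<string>();
var handlesByGlName = new Dictionary<uint, ulong> { [1] = 0xA1, [2] = 0xA2 };
var textureNames = new List<uint> { 1, 2 };

void DisposeAll()
{
    // Phase 1: drop residency for every cached handle.
    foreach (ulong h in handlesByGlName.Values) log.Add($"nonresident {h:X}");
    handlesByGlName.Clear();
    // Phase 2: only now delete the textures the handles referred to.
    foreach (uint name in textureNames) log.Add($"delete {name}");
    textureNames.Clear();
}

DisposeAll();
int lastRelease = log.FindLastIndex(e => e.StartsWith("nonresident"));
int firstDelete = log.FindIndex(e => e.StartsWith("delete"));
Console.WriteLine(lastRelease < firstDelete); // True
```

Splitting teardown into two separate loops keeps the invariant easy to see: no delete can be logged while a handle is still resident.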
- Step 4.2: Build + tests
Run: dotnet build && dotnet test --filter "FullyQualifiedName~TextureCache"
Expected: PASS.
- Step 4.3: Commit
phase(N.5) Task 4: TextureCache.Dispose releases bindless handles first
Iterating _bindlessHandlesByGlName + MakeNonResident before any
glDeleteTexture call, per ARB_bindless_texture spec — deleting a
texture with a resident handle is undefined behavior. Order: bindless
release → texture delete → magenta cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 5: Create mesh_modern.vert + mesh_modern.frag
Files:
- Create: src/AcDream.App/Rendering/Shaders/mesh_modern.vert
- Create: src/AcDream.App/Rendering/Shaders/mesh_modern.frag
Both files must be added to the <Content> <CopyToOutputDirectory> block in AcDream.App.csproj if shaders aren't auto-included. Check the existing pattern in the csproj: the existing mesh_instanced.vert/.frag should already be there.
- Step 5.1: Read csproj content includes
Read src/AcDream.App/AcDream.App.csproj. Find the <Content> block(s) that include *.vert / *.frag files. Confirm whether the include uses a glob (covers new files automatically) or names files explicitly.
If glob: nothing to do. If explicit: add mesh_modern.vert + mesh_modern.frag entries.
- Step 5.2: Write mesh_modern.vert
Create src/AcDream.App/Rendering/Shaders/mesh_modern.vert:
#version 430 core
#extension GL_ARB_bindless_texture : require
#extension GL_ARB_shader_draw_parameters : require
layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;
struct InstanceData {
mat4 transform;
// Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight):
// vec4 highlightColor;
// When implementing, extend stride here, increase _instanceSsbo upload
// size in WbDrawDispatcher, add a flat varying out, and consume in frag.
};
struct BatchData {
uvec2 textureHandle; // bindless handle for sampler2DArray
uint textureLayer; // layer index (always 0 for per-instance composites)
uint flags; // reserved
};
layout(std430, binding = 0) readonly buffer InstanceBuffer {
InstanceData Instances[];
};
layout(std430, binding = 1) readonly buffer BatchBuffer {
BatchData Batches[];
};
uniform mat4 uViewProjection;
out vec3 vNormal;
out vec2 vTexCoord;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;
void main() {
int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
mat4 model = Instances[instanceIndex].transform;
vec4 worldPos = model * vec4(aPosition, 1.0);
gl_Position = uViewProjection * worldPos;
vNormal = normalize(mat3(model) * aNormal);
vTexCoord = aTexCoord;
BatchData b = Batches[gl_DrawIDARB];
vTextureHandle = b.textureHandle;
vTextureLayer = b.textureLayer;
}
- Step 5.3: Write mesh_modern.frag
Create src/AcDream.App/Rendering/Shaders/mesh_modern.frag:
#version 430 core
#extension GL_ARB_bindless_texture : require
in vec3 vNormal;
in vec2 vTexCoord;
in flat uvec2 vTextureHandle;
in flat uint vTextureLayer;
uniform int uRenderPass; // 0 = opaque (discard alpha<0.95), 1 = transparent (discard alpha>=0.95)
uniform vec3 uAmbient;
uniform vec3 uSunDir;
uniform vec3 uSunColor;
out vec4 FragColor;
void main() {
sampler2DArray tex = sampler2DArray(vTextureHandle);
vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));
if (uRenderPass == 0) {
// Opaque pass: discard soft pixels — they belong to the transparent pass.
if (color.a < 0.95) discard;
} else {
// Transparent pass: discard hard pixels (already drawn opaque).
if (color.a >= 0.95) discard;
if (color.a < 0.05) discard; // skip totally-empty fragments
}
vec3 N = normalize(vNormal);
vec3 L = normalize(uSunDir);
float diff = max(dot(N, L), 0.0);
vec3 lit = uAmbient + uSunColor * diff;
color.rgb *= clamp(lit, 0.0, 1.0);
FragColor = color;
}
Note: this initial version uses uniform vec3 for the lighting params instead of a UBO. This matches the existing mesh_instanced.frag pattern (verify by reading it). If mesh_instanced.frag actually uses a UBO, change to match.
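The two-pass alpha test in the fragment shader can be mirrored on the CPU to confirm that each alpha value survives in at most one pass. The function below is a sketch that restates the shader's discard rules; the name SurvivesAlphaTest is hypothetical.

```csharp
using System;

// Mirrors the shader's discard rules. renderPass 0 = opaque, 1 = transparent.
bool SurvivesAlphaTest(float alpha, int renderPass)
{
    if (renderPass == 0) return alpha >= 0.95f;   // opaque pass keeps hard pixels only
    return alpha < 0.95f && alpha >= 0.05f;       // transparent pass keeps soft, non-empty pixels
}

foreach (float a in new[] { 1.0f, 0.5f, 0.01f })
{
    Console.WriteLine($"alpha={a}: opaque={SurvivesAlphaTest(a, 0)} transparent={SurvivesAlphaTest(a, 1)}");
}
```

Hard pixels are drawn in the opaque pass, soft pixels in the transparent pass, and near-empty pixels (alpha below 0.05) in neither.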
- Step 5.4: Read existing mesh_instanced.frag to verify lighting layout
Read src/AcDream.App/Rendering/Shaders/mesh_instanced.frag. Compare its lighting uniform shape to the version above. Adjust mesh_modern.frag to match (UBO if existing uses UBO, vec3 uniforms if existing uses uniforms).
- Step 5.5: Build to verify shaders are copied to output
Run: dotnet build src/AcDream.App/AcDream.App.csproj
Expected: PASS. After build, check src/AcDream.App/bin/Debug/net10.0/Rendering/Shaders/ contains mesh_modern.vert + mesh_modern.frag.
- Step 5.6: Commit
phase(N.5) Task 5: mesh_modern.vert + .frag — bindless + SSBO + indirect
New entity shaders modeled on WB's StaticObjectModern.* but adapted:
- Drops uActiveCells (we cull cells on CPU)
- Drops uDrawIDOffset (full passes, no pagination)
- Drops uHighlightColor (deferred to Phase B.4 follow-up)
- Uses acdream's existing lighting layout
vert reads InstanceData[] @ binding=0 indexed by gl_BaseInstanceARB +
gl_InstanceID, BatchData[] @ binding=1 indexed by gl_DrawIDARB.
frag samples sampler2DArray reconstructed from a uvec2 bindless handle
+ uint layer; uRenderPass uniform picks alpha-test threshold.
Not yet wired to the dispatcher: Task 10 swaps the shader load,
Tasks 9-10 swap the draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 6: Wire mesh_modern shader load + capability check in GameWindow
Files:
- Modify: src/AcDream.App/Rendering/GameWindow.cs
- Step 6.1: Read existing mesh_instanced load site
Read src/AcDream.App/Rendering/GameWindow.cs:960-980 (around the _meshShader = new Shader(...) line). Note the surrounding context — the WB foundation flag check, how the dispatcher is constructed.
- Step 6.2: Add capability-gated mesh_modern load
Find this block:
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
Replace with:
// N.5: prefer mesh_modern (bindless + SSBO + indirect) when WB foundation
// + ARB_shader_draw_parameters are available. Falls back to legacy
// mesh_instanced if any capability is missing — same code path as
// ACDREAM_USE_WB_FOUNDATION=0.
bool wbFoundationOn = WbFoundationFlag.IsEnabled;
bool useModernShader = false;
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
{
if (bindless.HasShaderDrawParameters(_gl))
{
try
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_modern.vert"),
Path.Combine(shadersDir, "mesh_modern.frag"));
_bindlessSupport = bindless;
useModernShader = true;
Console.WriteLine("[N.5] mesh_modern shader loaded (bindless + ARB_shader_draw_parameters)");
}
catch (Exception ex)
{
Console.WriteLine($"[N.5] mesh_modern compile failed, falling back: {ex.Message}");
}
}
else
{
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present, using legacy shader");
}
}
if (!useModernShader)
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
_bindlessSupport = null;
}
Add the _bindlessSupport field declaration alongside _meshShader:
private BindlessSupport? _bindlessSupport;
Also add using AcDream.App.Rendering.Wb; at the top of the file if not already there.
- Step 6.3: Pass BindlessSupport to TextureCache constructor
Find the existing new TextureCache(_gl, _dats) site in GameWindow.cs. Replace with:
_textureCache = new TextureCache(_gl, _dats, _bindlessSupport);
This requires _bindlessSupport to already be set. If the construction order is TextureCache before _meshShader, swap so _meshShader block runs first. Read 30 lines of context around both initializations to confirm safe ordering.
- Step 6.4: Build + smoke test
Run: dotnet build
Expected: PASS.
Run: dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"
Expected: 60+ tests PASS.
Smoke launch (manual, optional at this point):
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task6.log
Expected with the Step 6.2 block as written: launch logs show the [N.5] mesh_modern shader loaded line, but the visual is broken, because the modern shader is loaded while the dispatcher's per-group draw loop still hands it the wrong data layout. That is only fixed in Tasks 7-10.
To keep the visual identical to N.4 in the meantime, do not swap _meshShader to mesh_modern until Task 10 lands. For Task 6, leave the useModernShader = true path out and only run the legacy load; Tasks 9-10 flip it on. Replace the Step 6.2 block with:
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
{
if (bindless.HasShaderDrawParameters(_gl))
{
// Capability detected — store the support for later tasks.
// Shader swap happens in Task 10 once dispatcher is ready.
_bindlessSupport = bindless;
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
}
}
// Legacy shader load happens unconditionally for Task 6:
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
Task 10 will switch the shader load. Task 6 just plumbs _bindlessSupport so Task 7+ can use it.
- Step 6.5: Commit
phase(N.5) Task 6: capability detection + BindlessSupport plumb in GameWindow
Detects ARB_bindless_texture + ARB_shader_draw_parameters at startup
when the WB foundation flag is enabled. Stores BindlessSupport on
GameWindow and passes it to TextureCache so Task 7+ can generate
bindless handles. Mesh shader load remains mesh_instanced for now —
Task 10 swaps to mesh_modern after the dispatcher is rewired.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 7: Add SSBO + indirect buffer infrastructure to WbDrawDispatcher
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Create: src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs
- Step 7.1: Create DrawElementsIndirectCommand struct
Create src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs:
using System.Runtime.InteropServices;
namespace AcDream.App.Rendering.Wb;
/// <summary>
/// Layout matches what <c>glMultiDrawElementsIndirect</c> expects.
/// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this).
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
public uint Count; // index count for this draw
public uint InstanceCount; // number of instances
public uint FirstIndex; // offset into IBO, in indices
public int BaseVertex; // vertex offset into VBO
public uint BaseInstance; // first instance ID (offsets per-instance attribs / SSBO read)
}
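Because the command struct is five 32-bit fields, the byte offset of the transparent section in a contiguous indirect buffer follows directly from the opaque draw count. The sketch below assumes the layout described in the Architecture summary (opaque commands first, transparent commands immediately after). The helper name TransparentByteOffset is hypothetical.

```csharp
using System;

// Five 32-bit fields: Count, InstanceCount, FirstIndex, BaseVertex, BaseInstance.
const int CommandStride = 5 * sizeof(uint); // 20 bytes

// Assumed layout: opaque commands first, transparent commands right after them.
int TransparentByteOffset(int opaqueDrawCount) => opaqueDrawCount * CommandStride;

Console.WriteLine(CommandStride);            // 20
Console.WriteLine(TransparentByteOffset(3)); // 60
```

With this layout, a second multi-draw call for the transparent pass only needs that offset and the transparent draw count.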
- Step 7.2: Add SSBO + indirect buffer fields + BatchData struct to WbDrawDispatcher
In src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs, add at the top of the class (replacing the existing _instanceVbo field):
private readonly BindlessSupport _bindless;
// SSBO buffer ids
private uint _instanceSsbo;
private uint _batchSsbo;
private uint _indirectBuffer;
// Per-frame scratch arrays
private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance
private BatchData[] _batchData = new BatchData[256];
private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256];
private int _opaqueDrawCount;
private int _transparentDrawCount;
private int _transparentByteOffset;
[StructLayout(LayoutKind.Sequential, Pack = 4)]
private struct BatchData
{
public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
public uint TextureLayer;
public uint Flags;
}
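Remove the existing private readonly uint _instanceVbo; field, and add using System.Runtime.InteropServices; at the top of the file if it is not already there (StructLayout needs it).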
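BatchData stores the handle as a ulong on the C# side while the shader reads it as a uvec2. The sketch below shows how the two views line up on a little-endian machine, under the assumption that the low 32 bits land in x and the high 32 bits in y. The handle value is made up and the helper names are hypothetical.

```csharp
using System;

// Split a 64-bit handle into the two 32-bit words a GLSL uvec2 would hold
// (x = low word, y = high word, as laid out in little-endian memory).
(uint X, uint Y) SplitHandle(ulong value) => ((uint)(value & 0xFFFFFFFFUL), (uint)(value >> 32));
ulong JoinHandle(uint lo, uint hi) => ((ulong)hi << 32) | lo;

ulong handle = 0x0000000500000007UL; // made-up value, not a real GL handle
var (x, y) = SplitHandle(handle);
Console.WriteLine($"x={x} y={y}");             // x=7 y=5
Console.WriteLine(JoinHandle(x, y) == handle); // True

// Per-draw record size: 8 (handle) + 4 (layer) + 4 (flags) bytes.
const int BatchStride = sizeof(ulong) + 2 * sizeof(uint);
Console.WriteLine(BatchStride);                // 16
```

A 16-byte record with the 8-byte handle first needs no padding, which is why the C# struct with Pack = 4 and the std430 struct in the shader can share the same bytes.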
- Step 7.3: Update constructor
Change the constructor signature from:
public WbDrawDispatcher(
GL gl,
Shader shader,
TextureCache textures,
WbMeshAdapter meshAdapter,
EntitySpawnAdapter entitySpawnAdapter)
to:
public WbDrawDispatcher(
GL gl,
Shader shader,
TextureCache textures,
WbMeshAdapter meshAdapter,
EntitySpawnAdapter entitySpawnAdapter,
BindlessSupport bindless)
In the body, replace _instanceVbo = _gl.GenBuffer(); with:
_bindless = bindless ?? throw new ArgumentNullException(nameof(bindless));
_instanceSsbo = _gl.GenBuffer();
_batchSsbo = _gl.GenBuffer();
_indirectBuffer = _gl.GenBuffer();
- Step 7.4: Update Dispose
Replace the existing Dispose() body:
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_gl.DeleteBuffer(_instanceSsbo);
_gl.DeleteBuffer(_batchSsbo);
_gl.DeleteBuffer(_indirectBuffer);
}
- Step 7.5: Update WbDrawDispatcher construction site in GameWindow
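Find the existing new WbDrawDispatcher(...) call in GameWindow.cs and add the _bindlessSupport! argument. The ! is the null-forgiving operator: it relies on the dispatcher only being constructed when WB foundation is on, which is taken to imply bindless is present. If that assumption ever fails, the ArgumentNullException from Step 7.3 surfaces it at construction.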
- Step 7.6: Build + tests
Run: dotnet build
Expected: PASS.
Run: dotnet test --filter "FullyQualifiedName~Wb"
Expected: PASS. Existing tests don't exercise the changed buffer plumbing yet.
If WbDrawDispatcher.Draw references _instanceVbo, those references no longer compile. Leave the existing Draw() method untouched except for substituting _instanceVbo → _instanceSsbo, so the legacy draw path uses the SSBO as if it were a vertex buffer. The result compiles but is functionally wrong; the visual only breaks once the shader is flipped in Task 10, and Tasks 9-10 fully rewrite the draw loop. For the scope of Tasks 7-9 the only requirement is that the build stays green.
- Step 7.7: Commit
phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure
Adds DrawElementsIndirectCommand struct (20-byte layout for
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
constructor parameter — non-null required since the dispatcher is only
constructed when WB foundation is on.
Existing Draw() method substitutes _instanceVbo → _instanceSsbo for
compile. Behavior temporarily wrong; Tasks 9-10 fully rewrite the
draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 8: Update InstanceGroup + GroupKey for bindless handles
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Step 8.1: Update InstanceGroup
In WbDrawDispatcher.cs, replace the existing InstanceGroup class with:
private sealed class InstanceGroup
{
public uint Ibo;
public uint FirstIndex;
public int BaseVertex;
public int IndexCount;
public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4)
public uint TextureLayer; // 0 for per-instance composites
public TranslucencyKind Translucency;
public int FirstInstance;
public int InstanceCount;
public float SortDistance;
public readonly List<Matrix4x4> Matrices = new();
}
- Step 8.2: Update GroupKey
Replace the GroupKey record:
private readonly record struct GroupKey(
uint Ibo,
uint FirstIndex,
int BaseVertex,
int IndexCount,
ulong BindlessTextureHandle,
uint TextureLayer,
TranslucencyKind Translucency);
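Because GroupKey is a readonly record struct, two keys with identical fields compare equal and hash alike, which is what lets the _groups dictionary merge batches from different entities into one instanced group. A minimal sketch of that behavior, using hypothetical stand-in types (GroupKeySketch, KindSketch) rather than the real ones:

```csharp
using System.Collections.Generic;

public enum KindSketch { Opaque, AlphaBlend }

// Hypothetical mirror of GroupKey; record structs get value equality for free.
public readonly record struct GroupKeySketch(
    uint Ibo,
    uint FirstIndex,
    int BaseVertex,
    int IndexCount,
    ulong BindlessTextureHandle,
    uint TextureLayer,
    KindSketch Translucency);

public static class GroupKeyDemo
{
    // Counts how many distinct groups a sequence of keys collapses into.
    public static int CountGroups(IEnumerable<GroupKeySketch> keys)
    {
        var instancesPerGroup = new Dictionary<GroupKeySketch, int>();
        foreach (var key in keys)
        {
            instancesPerGroup.TryGetValue(key, out int seen);
            instancesPerGroup[key] = seen + 1;
        }
        return instancesPerGroup.Count;
    }
}
```

Two keys that differ only in the 64-bit texture handle land in separate groups, so per-entity palette overrides still split draws as they did in N.4.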
- Step 8.3: Update ResolveTexture method
Replace the existing ResolveTexture method (returns uint) with:
private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
{
uint surfaceId = batch.Key.SurfaceId;
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
uint overrideOrigTex = 0;
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
&& meshRef.SurfaceOverrides.TryGetValue(surfaceId, out overrideOrigTex);
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
if (entity.PaletteOverride is not null)
{
return _textures.GetOrUploadWithPaletteOverrideBindless(
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
}
else if (hasOrigTexOverride)
{
return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex);
}
else
{
return _textures.GetOrUploadBindless(surfaceId);
}
}
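The branch order in ResolveTexture matters: a palette override takes precedence over an original-texture override, and the sentinel surface ids short-circuit both. A small sketch of just that decision, with hypothetical names (TexturePathSketch, SelectPath) and no TextureCache dependency:

```csharp
public enum TexturePathSketch
{
    None,                 // surface id is 0 or 0xFFFFFFFF: nothing to draw
    PaletteOverride,      // entity carries a palette override
    OrigTextureOverride,  // mesh ref overrides the original texture only
    Plain                 // plain surface lookup
}

public static class ResolveTextureSketch
{
    // Mirrors the if / else-if / else ladder of ResolveTexture.
    public static TexturePathSketch SelectPath(
        uint surfaceId, bool hasPaletteOverride, bool hasOrigTexOverride)
    {
        if (surfaceId == 0 || surfaceId == 0xFFFFFFFF)
            return TexturePathSketch.None;
        if (hasPaletteOverride)
            return TexturePathSketch.PaletteOverride;
        if (hasOrigTexOverride)
            return TexturePathSketch.OrigTextureOverride;
        return TexturePathSketch.Plain;
    }
}
```

When both overrides are present, the palette path is chosen and the original-texture override travels along as an argument, matching the origTexOverride parameter in the method above.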
- Step 8.4: Update ClassifyBatches to use the new return type
Replace the existing ClassifyBatches to use ulong texHandle and pass the layer:
private void ClassifyBatches(
ObjectRenderData renderData,
ulong gfxObjId,
Matrix4x4 model,
WorldEntity entity,
MeshRef meshRef,
ulong palHash,
AcSurfaceMetadataTable metaTable)
{
for (int batchIdx = 0; batchIdx < renderData.Batches.Count; batchIdx++)
{
var batch = renderData.Batches[batchIdx];
TranslucencyKind translucency;
if (metaTable.TryLookup(gfxObjId, batchIdx, out var meta))
{
translucency = meta.Translucency;
}
else
{
translucency = batch.IsAdditive ? TranslucencyKind.Additive
: batch.IsTransparent ? TranslucencyKind.AlphaBlend
: TranslucencyKind.Opaque;
}
ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash);
if (texHandle == 0) continue;
// For per-instance composites we use 1-layer Texture2DArray, layer always 0.
// When N.6 adopts WB's atlas, this becomes batch's layer index.
uint texLayer = 0;
var key = new GroupKey(
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
batch.IndexCount, texHandle, texLayer, translucency);
if (!_groups.TryGetValue(key, out var grp))
{
grp = new InstanceGroup
{
Ibo = batch.IBO,
FirstIndex = batch.FirstIndex,
BaseVertex = (int)batch.BaseVertex,
IndexCount = batch.IndexCount,
BindlessTextureHandle = texHandle,
TextureLayer = texLayer,
Translucency = translucency,
};
_groups[key] = grp;
}
grp.Matrices.Add(model);
}
}
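When the metadata table has no entry, ClassifyBatches falls back to the batch flags, and the additive flag wins over the transparent flag. A minimal sketch of that fallback in isolation; TranslucencySketch and FromBatchFlags are hypothetical names, and only the three kinds the fallback can produce are modeled:

```csharp
public enum TranslucencySketch { Opaque, AlphaBlend, Additive }

public static class FallbackTranslucency
{
    // Same precedence as the nested conditional in ClassifyBatches:
    // additive first, then transparent, otherwise opaque.
    public static TranslucencySketch FromBatchFlags(bool isAdditive, bool isTransparent)
    {
        if (isAdditive) return TranslucencySketch.Additive;
        if (isTransparent) return TranslucencySketch.AlphaBlend;
        return TranslucencySketch.Opaque;
    }
}
```

Note that the fallback never yields ClipMap or InvAlpha; under this reading, those kinds only arrive through the metadata table.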
- Step 8.5: Update remaining DrawGroup/EnsureInstanceAttribs references
DrawGroup and EnsureInstanceAttribs are deleted in Task 10. Until then, keep the build green rather than letting it fail.
Replace the DrawGroup body with throw new NotImplementedException("Task 9-10 rewrites this"); so its call sites compile but throw at runtime. If EnsureInstanceAttribs no longer compiles against the new fields, comment it out together with its call sites in Draw(). Rendering is broken until Task 10; that's expected.
Update the Draw() method's per-group loop to compile:
foreach (var grp in _opaqueDraws)
{
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
DrawGroup(grp); // throws — Task 10 fixes
}
(The user does NOT visually verify at this task. Build green only.)
- Step 8.6: Build
Run: dotnet build
Expected: PASS.
Run: dotnet test --filter "FullyQualifiedName~Wb"
Expected: existing tests PASS (they're CPU-only — they don't actually invoke DrawGroup).
- Step 8.7: Commit
phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer
Replaces uint TextureHandle (32-bit GL name) with ulong
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
return type. Adds TextureLayer (always 0 for per-instance composites,
becomes meaningful when WB atlas is adopted in N.6).
ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants.
DrawGroup body throws NotImplementedException — Task 9-10 rewrites
the draw loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 9: Build BatchData + DEIC arrays per frame (TDD)
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Create: tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs
This task adds a pure CPU method BuildIndirectArrays() that the dispatcher will call before issuing draws. Unit-testable without GL context.
- Step 9.1: Write the failing test
Create tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs:
using System.Numerics;
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;
namespace AcDream.Core.Tests.Rendering.Wb;
/// <summary>
/// Pure CPU test of <see cref="WbDrawDispatcher.BuildIndirectArrays"/>.
/// Builds a synthetic group set and verifies the laid-out indirect commands
/// match the spec §5 walk-through.
/// </summary>
public sealed class WbDrawDispatcherIndirectBuilderTests
{
[Fact]
public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst()
{
// Arrange — synthetic groups laid out as in spec §5
var groups = new List<WbDrawDispatcher.IndirectGroupInput>
{
new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend),
new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
};
var indirect = new DrawElementsIndirectCommand[16];
var batch = new WbDrawDispatcher.BatchDataPublic[16];
// Act
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
// Assert layout
Assert.Equal(2, result.OpaqueCount);
Assert.Equal(1, result.TransparentCount);
Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20
// Opaque section: BuildIndirectArrays preserves input order (sorting is the caller's job, see Task 10)
Assert.Equal(100u, indirect[0].Count);
Assert.Equal(0u, indirect[0].FirstIndex);
Assert.Equal(0, indirect[0].BaseVertex);
Assert.Equal(12u, indirect[0].InstanceCount);
Assert.Equal(0u, indirect[0].BaseInstance);
Assert.Equal(50u, indirect[1].Count);
Assert.Equal(300u, indirect[1].FirstIndex);
Assert.Equal(100, indirect[1].BaseVertex);
Assert.Equal(1u, indirect[1].InstanceCount);
Assert.Equal(24u, indirect[1].BaseInstance);
// Transparent section
Assert.Equal(200u, indirect[2].Count);
Assert.Equal(100u, indirect[2].FirstIndex);
Assert.Equal(12u, indirect[2].InstanceCount);
Assert.Equal(12u, indirect[2].BaseInstance);
// BatchData parallel
Assert.Equal(0xAAul, batch[0].TextureHandle);
Assert.Equal(0xCCul, batch[1].TextureHandle);
Assert.Equal(0xBBul, batch[2].TextureHandle);
}
[Fact]
public void EmptyGroupList_ProducesZeroCounts()
{
var groups = new List<WbDrawDispatcher.IndirectGroupInput>();
var indirect = new DrawElementsIndirectCommand[0];
var batch = new WbDrawDispatcher.BatchDataPublic[0];
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
Assert.Equal(0, result.OpaqueCount);
Assert.Equal(0, result.TransparentCount);
Assert.Equal(0, result.TransparentByteOffset);
}
}
- Step 9.2: Run, verify it fails
Run: dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"
Expected: COMPILE FAIL — BuildIndirectArrays and supporting public types don't exist.
- Step 9.3: Implement BuildIndirectArrays + supporting types
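In src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs, add public helper types + static method (above the private InstanceGroup class). The [StructLayout] attribute needs using System.Runtime.InteropServices; at the top of the file if it isn't already imported: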
/// <summary>Public view of the per-group inputs to <see cref="BuildIndirectArrays"/> — used in tests.</summary>
public readonly record struct IndirectGroupInput(
int IndexCount,
uint FirstIndex,
int BaseVertex,
int InstanceCount,
int FirstInstance,
ulong TextureHandle,
uint TextureLayer,
TranslucencyKind Translucency);
/// <summary>Public mirror of the per-group BatchData laid into the SSBO. Tests verify alignment.</summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct BatchDataPublic
{
public ulong TextureHandle;
public uint TextureLayer;
public uint Flags;
}
public readonly record struct IndirectLayoutResult(
int OpaqueCount,
int TransparentCount,
int TransparentByteOffset);
/// <summary>
/// Lays out the indirect commands + parallel BatchData array contiguously:
/// opaque section first, transparent section second. Pure CPU, no GL state.
/// Caller passes scratch arrays (pre-sized).
/// </summary>
public static IndirectLayoutResult BuildIndirectArrays(
IReadOnlyList<IndirectGroupInput> groups,
DrawElementsIndirectCommand[] indirectScratch,
BatchDataPublic[] batchScratch)
{
int opaqueCount = 0;
int transparentCount = 0;
// First pass: count
foreach (var g in groups)
{
if (IsOpaque(g.Translucency)) opaqueCount++;
else transparentCount++;
}
// Second pass: lay out — opaque [0..opaqueCount), transparent [opaqueCount..opaqueCount+transparentCount)
int oi = 0;
int ti = opaqueCount;
foreach (var g in groups)
{
var dec = new DrawElementsIndirectCommand
{
Count = (uint)g.IndexCount,
InstanceCount = (uint)g.InstanceCount,
FirstIndex = g.FirstIndex,
BaseVertex = g.BaseVertex,
BaseInstance = (uint)g.FirstInstance,
};
var bd = new BatchDataPublic
{
TextureHandle = g.TextureHandle,
TextureLayer = g.TextureLayer,
Flags = 0,
};
if (IsOpaque(g.Translucency))
{
indirectScratch[oi] = dec;
batchScratch[oi] = bd;
oi++;
}
else
{
indirectScratch[ti] = dec;
batchScratch[ti] = bd;
ti++;
}
}
int sizeofDEIC = 20; // matches struct layout
return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * sizeofDEIC);
}
private static bool IsOpaque(TranslucencyKind t)
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
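The doc comment on BatchDataPublic says tests verify alignment, but neither test above does. A minimal sketch of such a check, using a hypothetical copy of the struct (BatchDataSketch). The expected 16-byte stride is what Pack = 4 gives for one ulong plus two uints; whether that matches the SSBO declaration in mesh_modern.vert is an assumption to confirm against the shader.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical copy of BatchDataPublic, same fields and packing.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct BatchDataSketch
{
    public ulong TextureHandle; // 64-bit bindless handle
    public uint TextureLayer;   // array layer
    public uint Flags;          // reserved, written as 0
}

public static class BatchDataLayoutCheck
{
    // Bytes between consecutive elements when the array is uploaded as-is.
    public static int Stride() => Marshal.SizeOf<BatchDataSketch>();

    // Byte offset of a named field.
    public static int OffsetOf(string field) =>
        (int)Marshal.OffsetOf<BatchDataSketch>(field);
}
```

If the shader side declares the handle as a uvec2 followed by two uints, the offsets 0, 8 and 12 are the values to compare against.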
- Step 9.4: Run test, verify pass
Run: dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"
Expected: PASS (2 tests).
Run full filter: dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"
Expected: 60+ existing tests + 2 new = PASS.
- Step 9.5: Commit
phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.
Tests cover the spec §5 walk-through layout: per-group fields propagate
correctly, opaque/transparent partition lands at the expected indices.
Static + public so tests can exercise without a GL context. Tasks
10-11 wire it into Draw().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 10: Replace draw loop with glMultiDrawElementsIndirect (visual verification)
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Modify: src/AcDream.App/Rendering/GameWindow.cs
This is the load-bearing task. After this lands, visual verification is required.
- Step 10.1: Rewrite WbDrawDispatcher.Draw
Replace the entire Draw() method body in src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs. Phases 1-3 (entity walk, group bucketing, matrix layout) stay as they are; phases 4-8 (indirect array build, buffer upload, VAO bind, opaque and transparent passes) are rewritten:
public unsafe void Draw(
ICamera camera,
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
FrustumPlanes? frustum = null,
uint? neverCullLandblockId = null,
HashSet<uint>? visibleCellIds = null,
HashSet<uint>? animatedEntityIds = null)
{
_shader.Use();
var vp = camera.View * camera.Projection;
_shader.SetMatrix4("uViewProjection", vp);
// Lighting uniforms — match what mesh_modern.frag declares (Task 5.3).
// Read the existing N.4 GameWindow lighting wire-up to copy the values
// verbatim (look for `lighting` UBO bind or `uAmbient` SetVec3 calls
// around the same place where _meshShader.Use() / SetMatrix4 happens).
// If N.4 used a UBO: change mesh_modern.frag in Task 5.3 to match the UBO,
// then bind the UBO here via `_gl.BindBufferBase(UniformBuffer, 1, lightingUbo)`.
// If N.4 used uniforms: replicate the same SetVec3 calls here.
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
Vector3 camPos = Vector3.Zero;
if (Matrix4x4.Invert(camera.View, out var invView))
camPos = invView.Translation;
// ── Phases 1-2: walk entities, build groups, lay matrices ───────────
foreach (var grp in _groups.Values) grp.Matrices.Clear();
var metaTable = _meshAdapter.MetadataTable;
uint anyVao = 0;
foreach (var entry in landblockEntries)
{
bool landblockVisible = frustum is null
|| entry.LandblockId == neverCullLandblockId
|| FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax);
if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
continue;
foreach (var entity in entry.Entities)
{
if (entity.MeshRefs.Count == 0) continue;
bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true;
if (!landblockVisible && !isAnimated) continue;
if (entity.ParentCellId.HasValue && visibleCellIds is not null
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
continue;
if (frustum is not null && !isAnimated && entry.LandblockId != neverCullLandblockId)
{
var p = entity.Position;
var aMin = new Vector3(p.X - PerEntityCullRadius, p.Y - PerEntityCullRadius, p.Z - PerEntityCullRadius);
var aMax = new Vector3(p.X + PerEntityCullRadius, p.Y + PerEntityCullRadius, p.Z + PerEntityCullRadius);
if (!FrustumCuller.IsAabbVisible(frustum.Value, aMin, aMax))
continue;
}
if (diag) _entitiesSeen++;
var entityWorld =
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
Matrix4x4.CreateTranslation(entity.Position);
ulong palHash = 0;
if (entity.PaletteOverride is not null)
palHash = TextureCache.HashPaletteOverride(entity.PaletteOverride);
bool drewAny = false;
for (int partIdx = 0; partIdx < entity.MeshRefs.Count; partIdx++)
{
var meshRef = entity.MeshRefs[partIdx];
ulong gfxObjId = meshRef.GfxObjId;
var renderData = _meshAdapter.TryGetRenderData(gfxObjId);
if (renderData is null) { if (diag) _meshesMissing++; continue; }
drewAny = true;
if (anyVao == 0) anyVao = renderData.VAO;
if (renderData.IsSetup && renderData.SetupParts.Count > 0)
{
foreach (var (partGfxObjId, partTransform) in renderData.SetupParts)
{
var partData = _meshAdapter.TryGetRenderData(partGfxObjId);
if (partData is null) continue;
var model = ComposePartWorldMatrix(entityWorld, meshRef.PartTransform, partTransform);
ClassifyBatches(partData, partGfxObjId, model, entity, meshRef, palHash, metaTable);
}
}
else
{
var model = meshRef.PartTransform * entityWorld;
ClassifyBatches(renderData, gfxObjId, model, entity, meshRef, palHash, metaTable);
}
}
if (diag && drewAny) _entitiesDrawn++;
}
}
if (anyVao == 0) { if (diag) MaybeFlushDiag(); return; }
int totalInstances = 0;
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
if (totalInstances == 0) { if (diag) MaybeFlushDiag(); return; }
// ── Phase 3: assign FirstInstance per group, lay matrices contiguous ─
int needed = totalInstances * 16;
if (_instanceData.Length < needed)
_instanceData = new float[needed + 256 * 16];
_opaqueDraws.Clear();
_translucentDraws.Clear();
int cursor = 0;
foreach (var grp in _groups.Values)
{
if (grp.Matrices.Count == 0) continue;
grp.FirstInstance = cursor;
grp.InstanceCount = grp.Matrices.Count;
var first = grp.Matrices[0];
var grpPos = new Vector3(first.M41, first.M42, first.M43);
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
for (int i = 0; i < grp.Matrices.Count; i++)
{
WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]);
cursor++;
}
if (IsOpaqueGroup(grp.Translucency))
_opaqueDraws.Add(grp);
else
_translucentDraws.Add(grp);
}
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
// ── Phase 4: build BatchData + DEIC arrays ──────────────────────────
int totalDraws = _opaqueDraws.Count + _translucentDraws.Count;
if (_batchData.Length < totalDraws)
_batchData = new BatchData[totalDraws + 64];
if (_indirectCommands.Length < totalDraws)
_indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64];
var groupInputs = new List<IndirectGroupInput>(totalDraws);
foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g));
foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g));
// BuildIndirectArrays fills a BatchDataPublic[] scratch array; copy the
// results into _batchData afterwards. BatchData and BatchDataPublic carry
// the same three fields. (Promote batchScratch to a reusable field if the
// per-frame allocation shows up in profiles.)
var batchScratch = new BatchDataPublic[totalDraws];
var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchScratch);
for (int i = 0; i < totalDraws; i++)
{
_batchData[i] = new BatchData
{
TextureHandle = batchScratch[i].TextureHandle,
TextureLayer = batchScratch[i].TextureLayer,
Flags = batchScratch[i].Flags,
};
}
_opaqueDrawCount = layout.OpaqueCount;
_transparentDrawCount = layout.TransparentCount;
_transparentByteOffset = layout.TransparentByteOffset;
// ── Phase 5: upload three buffers ───────────────────────────────────
fixed (float* ip = _instanceData)
UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float));
fixed (BatchData* bp = _batchData)
UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData));
fixed (DrawElementsIndirectCommand* cp = _indirectCommands)
{
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.BufferData(BufferTargetARB.DrawIndirectBuffer,
(nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw);
}
// ── Phase 6: bind global VAO once ───────────────────────────────────
_gl.BindVertexArray(anyVao);
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
_gl.Disable(EnableCap.CullFace);
// ── Phase 7: opaque pass ───────────────────────────────────────────
if (_opaqueDrawCount > 0)
{
_gl.Disable(EnableCap.Blend);
_gl.DepthMask(true);
_shader.SetInt("uRenderPass", 0);
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.MultiDrawElementsIndirect(
PrimitiveType.Triangles,
DrawElementsType.UnsignedShort,
indirect: (void*)0,
drawcount: (uint)_opaqueDrawCount,
stride: (uint)sizeof(DrawElementsIndirectCommand));
}
// ── Phase 8: transparent pass ──────────────────────────────────────
if (_transparentDrawCount > 0)
{
_gl.Enable(EnableCap.Blend);
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
_gl.DepthMask(false);
_shader.SetInt("uRenderPass", 1);
_gl.MultiDrawElementsIndirect(
PrimitiveType.Triangles,
DrawElementsType.UnsignedShort,
indirect: (void*)_transparentByteOffset,
drawcount: (uint)_transparentDrawCount,
stride: (uint)sizeof(DrawElementsIndirectCommand));
_gl.DepthMask(true);
_gl.Disable(EnableCap.Blend);
}
_gl.Disable(EnableCap.CullFace);
_gl.BindVertexArray(0);
if (diag)
{
_drawsIssued += _opaqueDrawCount + _transparentDrawCount;
_instancesIssued += totalInstances;
MaybeFlushDiag();
}
}
private static bool IsOpaqueGroup(TranslucencyKind t)
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
private static IndirectGroupInput ToInput(InstanceGroup g) => new(
IndexCount: g.IndexCount,
FirstIndex: g.FirstIndex,
BaseVertex: g.BaseVertex,
InstanceCount: g.InstanceCount,
FirstInstance: g.FirstInstance,
TextureHandle: g.BindlessTextureHandle,
TextureLayer: g.TextureLayer,
Translucency: g.Translucency);
private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount)
{
_gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo);
_gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw);
_gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo);
}
Delete the old DrawGroup and EnsureInstanceAttribs methods, plus any leftover uint-returning ResolveTexture from before Task 8; they're no longer called.
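The transparent pass starts partway into both the indirect buffer and the BatchData array. A small sketch of the two offsets involved, assuming the 20-byte command size from Task 7; SectionOffsetsSketch and IndirectSections are hypothetical names. One point worth checking against mesh_modern.vert (not shown in this section): gl_DrawIDARB counts from zero within each glMultiDrawElementsIndirect call, so if the shader indexes BatchData by draw id alone, the second call needs the first-batch index supplied some other way, for example as a uniform.

```csharp
// Hypothetical pair of offsets for one section of the indirect buffer.
public readonly record struct SectionOffsetsSketch(int ByteOffset, int FirstBatchIndex);

public static class IndirectSections
{
    // Size of one DrawElementsIndirectCommand: five 4-byte fields.
    public const int CommandSizeInBytes = 20;

    // The transparent section begins right after the opaque commands.
    // ByteOffset is what the indirect pointer argument receives;
    // FirstBatchIndex is the matching element index into BatchData.
    public static SectionOffsetsSketch Transparent(int opaqueCount) =>
        new(opaqueCount * CommandSizeInBytes, opaqueCount);
}
```

With two opaque groups, as in the Task 9 test, the transparent section starts at byte 40 and at BatchData element 2.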
- Step 10.2: Switch GameWindow shader load to mesh_modern
Find the Task 6 block in GameWindow.cs and change the shader load from mesh_instanced to mesh_modern when _bindlessSupport != null:
if (_bindlessSupport is not null)
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_modern.vert"),
Path.Combine(shadersDir, "mesh_modern.frag"));
Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
_meshShader = new Shader(_gl,
Path.Combine(shadersDir, "mesh_instanced.vert"),
Path.Combine(shadersDir, "mesh_instanced.frag"));
}
- Step 10.3: Build + run all tests
Run: dotnet build
Expected: PASS.
Run: dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"
Expected: 60+ tests + 2 new BuildIndirectArrays tests PASS.
- Step 10.4: Visual smoke test (USER GATE)
Launch:
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_WB_DIAG = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task10.log
Expected:
- Console shows [N.5] mesh_modern shader loaded.
- Holtburg renders with characters + scenery + buildings visible.
- [WB-DIAG] shows draws dropping from N.4's hundreds to ~3-5 per frame for entity rendering.
User confirms visual identity. If broken, debug. Most likely failure modes:
- Shader compile failure → console log will show GLSL info log; fix vert/frag.
- Black textures everywhere → bindless handle generation broken; check _bindless is non-null in TextureCache.
- Wrong geometry → BaseVertex / FirstIndex misaligned; verify against N.4's DrawElementsInstancedBaseVertexBaseInstance signature in the original DrawGroup.
- Wrong matrices on entities → InstanceSsbo upload size wrong; verify totalInstances * 16 * sizeof(float).
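For the last failure mode, it helps to see where the 16 comes from: each instance contributes one Matrix4x4 flattened to 16 floats. The sketch below is a hypothetical version of the WriteMatrix helper (the real one is existing N.4 code and is not shown here); writing fields in M11..M44 order is an assumption.

```csharp
using System.Numerics;

public static class InstanceDataSketch
{
    public const int FloatsPerMatrix = 16;

    // Flattens one matrix into dst starting at offset, in M11..M44 order (assumed).
    public static void WriteMatrix(float[] dst, int offset, Matrix4x4 m)
    {
        dst[offset + 0] = m.M11;  dst[offset + 1] = m.M12;
        dst[offset + 2] = m.M13;  dst[offset + 3] = m.M14;
        dst[offset + 4] = m.M21;  dst[offset + 5] = m.M22;
        dst[offset + 6] = m.M23;  dst[offset + 7] = m.M24;
        dst[offset + 8] = m.M31;  dst[offset + 9] = m.M32;
        dst[offset + 10] = m.M33; dst[offset + 11] = m.M34;
        dst[offset + 12] = m.M41; dst[offset + 13] = m.M42;
        dst[offset + 14] = m.M43; dst[offset + 15] = m.M44;
    }

    // Byte count handed to the instance SSBO upload.
    public static int UploadByteCount(int totalInstances) =>
        totalInstances * FloatsPerMatrix * sizeof(float);
}
```

With System.Numerics conventions the translation sits in M41..M43, so a misplaced entity usually shows up as wrong values at offsets 12 to 14 of its 16-float slot.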
- Step 10.5: Commit only after visual verification passes
phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three buffers (instance-matrix SSBO @ binding=0, batch-data
SSBO @ binding=1, indirect command buffer).
Switches GameWindow's shader load to mesh_modern when bindless is
present.
Visual verification: Holtburg courtyard renders identical to N.4.
Entity draw calls drop from "few hundred per pass" to 1 per pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 11: Update ClassifyBatches for translucency restructure (TDD)
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Create: tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs
Per Decision 2: Additive and InvAlpha merge into transparent (alpha-blend). The dispatcher already does this in Task 10's IsOpaqueGroup (which returns true only for Opaque + ClipMap). This task ADDS a unit test and tightens the contract.
- Step 11.1: Write the failing test
Create tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs:
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;
namespace AcDream.Core.Tests.Rendering.Wb;
/// <summary>
/// Locks in the N.5 translucency partition contract (Decision 2):
/// Opaque + ClipMap → opaque indirect; AlphaBlend + Additive + InvAlpha → transparent.
/// </summary>
public sealed class WbDrawDispatcherTranslucencyTests
{
[Theory]
[InlineData(TranslucencyKind.Opaque, true)]
[InlineData(TranslucencyKind.ClipMap, true)]
[InlineData(TranslucencyKind.AlphaBlend, false)]
[InlineData(TranslucencyKind.Additive, false)]
[InlineData(TranslucencyKind.InvAlpha, false)]
public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected)
{
Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind));
}
}
- Step 11.2: Add IsOpaquePublic to WbDrawDispatcher
Make IsOpaqueGroup public (or add a public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t); shim):
public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);
- Step 11.3: Run test, verify PASS
Run: dotnet test --filter "FullyQualifiedName~WbDrawDispatcherTranslucency"
Expected: 5 tests PASS.
Run all: dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"
Expected: 60+ + 2 + 5 = 67+ PASS.
- Step 11.4: Commit
phase(N.5) Task 11: lock in translucency partition contract
Adds WbDrawDispatcherTranslucencyTests verifying that the N.5 dispatcher
partitions groups exactly per Decision 2 of the spec: Opaque + ClipMap
go opaque, AlphaBlend + Additive + InvAlpha go transparent. Catches
future refactors that drift the partition.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 12: Add CPU stopwatch + GL timer query timing in [WB-DIAG]
Files:
- Modify: src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
- Step 12.1: Add timing fields
In WbDrawDispatcher.cs, add to the diagnostic-counter block:
// CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1
private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new();
private readonly long[] _cpuSamples = new long[256]; // microseconds
private int _cpuSampleCursor;
private uint _gpuQueryOpaque;
private uint _gpuQueryTransparent;
private readonly long[] _gpuSamples = new long[256]; // microseconds
private int _gpuSampleCursor;
private bool _gpuQueriesInitialized;
- Step 12.2: Initialize GPU queries lazily in Draw()
In Draw(), immediately after the bool diag = ... line (the block reads diag, so it cannot go before that declaration), add:
if (diag && !_gpuQueriesInitialized)
{
_gpuQueryOpaque = _gl.GenQuery();
_gpuQueryTransparent = _gl.GenQuery();
_gpuQueriesInitialized = true;
}
- Step 12.3: Wrap the draw passes with timing
Do not gate the stopwatch on diag: call _cpuStopwatch.Restart() unconditionally (always on, cheap) and only record and log samples under diag.
At the very top of Draw() (just inside the method):
_cpuStopwatch.Restart();
Wrap the opaque pass MultiDrawElementsIndirect call:
if (diag) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque);
_gl.MultiDrawElementsIndirect(...); // existing call
if (diag) _gl.EndQuery(QueryTarget.TimeElapsed);
Same for transparent pass with _gpuQueryTransparent.
At the bottom of Draw() (after _gl.BindVertexArray(0)):
_cpuStopwatch.Stop();
if (diag)
{
long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency;
_cpuSamples[_cpuSampleCursor] = cpuUs;
_cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length;
// GPU sample read — non-blocking, may not be ready yet on first frames
int avail = 0;
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResultAvailable, out avail);
if (avail != 0)
{
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResult, out long opaqueNs);
_gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResult, out long transNs);
long gpuUs = (opaqueNs + transNs) / 1000;
_gpuSamples[_gpuSampleCursor] = gpuUs;
_gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length;
}
}
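The sample bookkeeping above boils down to two small pieces: converting Stopwatch ticks to microseconds, and a fixed-size ring that overwrites its oldest slot. A self-contained sketch of both, with hypothetical names (SampleRingSketch, TicksToMicros); the dispatcher itself uses plain arrays and cursors rather than a class like this.

```csharp
using System;
using System.Diagnostics;

public sealed class SampleRingSketch
{
    private readonly long[] _samples;
    private int _cursor;

    public SampleRingSketch(int capacity) => _samples = new long[capacity];

    // Stores one sample and advances the cursor, wrapping at capacity.
    public void Add(long micros)
    {
        _samples[_cursor] = micros;
        _cursor = (_cursor + 1) % _samples.Length;
    }

    // Copy of the raw slots; unfilled slots are still 0.
    public long[] Snapshot() => (long[])_samples.Clone();

    // Stopwatch ticks are in units of 1/frequency seconds.
    public static long TicksToMicros(long ticks, long frequency) =>
        ticks * 1_000_000L / frequency;
}
```

Multiplying before dividing keeps sub-tick precision in integer math; for per-frame tick counts the intermediate product stays far below the range of long.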
- Step 12.4: Update MaybeFlushDiag to log timing percentiles
Replace the existing MaybeFlushDiag body:
private void MaybeFlushDiag()
{
long now = Environment.TickCount64;
if (now - _lastLogTick > 5000)
{
long cpuMed = MedianMicros(_cpuSamples);
long cpuP95 = Percentile95Micros(_cpuSamples);
long gpuMed = MedianMicros(_gpuSamples);
long gpuP95 = Percentile95Micros(_gpuSamples);
Console.WriteLine(
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " +
$"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95");
_entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
_lastLogTick = now;
}
}
private static long MedianMicros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
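// After the ascending sort, unfilled (zero) slots come first and the nz
// recorded samples occupy the last nz slots; return the middle of that range.
int nz = 0;
foreach (var v in copy) if (v > 0) { nz++; }
if (nz == 0) return 0;
return copy[copy.Length - nz + nz / 2];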
}
private static long Percentile95Micros(long[] samples)
{
var copy = (long[])samples.Clone();
Array.Sort(copy);
int nz = 0;
foreach (var v in copy) if (v > 0) { nz++; }
if (nz == 0) return 0;
int idx = copy.Length - 1 - (int)(nz * 0.05);
return copy[idx];
}
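Both helpers follow the same idea: ignore unfilled (zero) slots, sort, and index into what remains. A compact sketch of that idea as one function, using the nearest-rank convention; the names are hypothetical and the index convention is an assumption that can differ by one slot from the index arithmetic in the helpers above.

```csharp
using System;

public static class PercentileSketch
{
    // Nearest-rank percentile over the recorded (non-zero) samples.
    // fraction is in (0, 1], e.g. 0.5 for the median, 0.95 for p95.
    public static long OfNonZero(long[] samples, double fraction)
    {
        long[] recorded = Array.FindAll(samples, v => v > 0);
        if (recorded.Length == 0) return 0;

        Array.Sort(recorded);
        int rank = (int)Math.Ceiling(fraction * recorded.Length); // 1-based rank
        int index = Math.Max(0, rank - 1);
        return recorded[index];
    }
}
```

For the slots { 0, 0, 10, 20, 30, 40, 50 } this returns 30 for the median and 50 for p95: with only five recorded samples the 95th percentile is simply the largest one.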
- Step 12.5: Update Dispose
Add to Dispose():
if (_gpuQueriesInitialized)
{
_gl.DeleteQuery(_gpuQueryOpaque);
_gl.DeleteQuery(_gpuQueryTransparent);
}
- Step 12.6: Build + smoke test
Run: dotnet build
Expected: PASS.
Smoke launch with ACDREAM_WB_DIAG=1. Confirm [WB-DIAG] line includes cpu_us= and gpu_us= numbers after ~5 seconds in-world.
- Step 12.7: Commit
phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG]
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent), polled non-blocking on next frame.
Numbers populate the SHIP commit message (Task 20).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 13: Capture before/after perf numbers (USER GATE)
Files:
- (none; measurement task)
- Step 13.1: Capture N.5 numbers in Holtburg courtyard
Launch acdream with ACDREAM_WB_DIAG=1. Position character at Holtburg courtyard, 30m elevated, looking SW. Stand still for ~30 seconds. Read the [WB-DIAG] line. Record:
N.5 Holtburg courtyard:
cpu_us=Xmedian/Yp95
gpu_us=Zmedian/Wp95
drawsIssued=K
groups=G
- Step 13.2: Capture N.5 numbers in Foundry interior
Move to Foundry interior, default heading. Same 30s. Record same metrics.
- Step 13.3: Compare against N.4 baseline
Stash N.5 changes:
git stash
git checkout c445364 # N.4 SHIP
dotnet build
Repeat measurements with N.4 active. Record numbers in the same format. Compare:
| Scene | N.4 cpu med | N.5 cpu med | Δ% | N.4 gpu med | N.5 gpu med | Δ% | N.4 draws | N.5 draws |
|---|---|---|---|---|---|---|---|---|
| Holtburg courtyard | | | | | | | | |
| Foundry interior | | | | | | | | |
Restore N.5:
git checkout claude/priceless-feistel-c12935
git stash pop
- Step 13.4: Verify acceptance gates
Acceptance per spec §8.3:
- CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction).
- GPU rendering time within ±10% of N.4 (sanity).
- drawsIssued ≤ 5 per pass.
If gates fail: investigate. Common causes:
- Per-frame glBufferData is the bottleneck → defer to N.6 persistent-mapping (per Decision 7).
- SSBO indexing slower than expected on driver → check NVIDIA / AMD / Intel separately.
- Group bucketing not sharing groups well → groups count dominates drawsIssued.
Save the table to a file: docs/plans/2026-05-08-phase-n5-perf-baseline.md. This goes in the SHIP commit.
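The Δ% columns and the three gates can be computed mechanically once the medians are recorded. A minimal sketch, assuming Δ% means (N.5 − N.4) / N.4 × 100 and that the CPU gate compares medians; SceneSampleSketch and AcceptanceGatesSketch are hypothetical names, not part of the codebase.

```csharp
using System;

// One row of measurements for a scene (hypothetical shape).
public readonly record struct SceneSampleSketch(
    double CpuMedianUs, double GpuMedianUs, int DrawsPerPass);

public static class AcceptanceGatesSketch
{
    // Relative change from N.4 to N.5, in percent; negative means faster.
    public static double DeltaPercent(double n4, double n5) =>
        (n5 - n4) / n4 * 100.0;

    // CPU dispatcher time must be at most 70% of N.4.
    public static bool CpuGate(SceneSampleSketch n4, SceneSampleSketch n5) =>
        n5.CpuMedianUs <= 0.70 * n4.CpuMedianUs;

    // GPU time must stay within plus or minus 10% of N.4.
    public static bool GpuGate(SceneSampleSketch n4, SceneSampleSketch n5) =>
        Math.Abs(DeltaPercent(n4.GpuMedianUs, n5.GpuMedianUs)) <= 10.0;

    // At most 5 draws per pass.
    public static bool DrawGate(SceneSampleSketch n5) =>
        n5.DrawsPerPass <= 5;
}
```

Feeding each table row through these three checks gives the PASS / FAIL values for the commit message without hand arithmetic.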
- Step 13.5: Commit perf baseline
git add docs/plans/2026-05-08-phase-n5-perf-baseline.md
git commit -m "phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry
[heredoc body]"
Heredoc body:
phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry
Captures CPU + GPU + draw-count numbers for the SHIP gate.
Acceptance gates:
- CPU dispatcher time ≤ 70% of N.4: [PASS / FAIL]
- GPU rendering time within ±10% of N.4: [PASS / FAIL]
- drawsIssued ≤ 5 per pass: [PASS / FAIL]
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 14: Visual verification at Holtburg + Foundry + magic content (USER GATE)
Files:
- (none — verification task; only commits if regressions found)
- Step 14.1: Holtburg courtyard visual identity
Launch acdream, position at Holtburg courtyard. Compare side-by-side against N.4 (use git stash + checkout flow from Task 13 if needed). Confirm:
- All scenery (trees, fences, rocks, buildings) renders correctly.
- No missing entities.
- No z-fighting introduced.
- No exploded character parts.
- Step 14.2: Foundry interior visual identity
Move to Foundry. Confirm same checklist. Pay attention to dense static-object scenes.
- Step 14.3: Indoor → outdoor transition
Walk through portal/door from outdoors to indoors and back. Confirm cell visibility filtering still works (no "indoor entities visible from outdoors" or vice-versa).
- Step 14.4: Drudge / character close-up
Find a drudge or NPC. Walk close. Confirm the Issue #47 close-detail mesh is still preserved (high-detail face / hands, not the low-detail far-LOD).
- Step 14.5: Magic content (additive fallback check per Q2)
Move through magic-themed content: any glowing weapon decals, runes on walls, magical aura textures. Compare against N.4. If anything appears "darker" or "less luminous" → that's the Decision 2 additive regression.
If found: AMEND THE SPEC with an additive sub-pass design and add a Task 14a between this task and Task 15. Do NOT proceed to ship without resolving.
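Why an additive surface drawn through the alpha-blend path would look darker can be seen from the per-channel blend arithmetic. The sketch below assumes the transparent pass uses the conventional glBlendFunc(SrcAlpha, OneMinusSrcAlpha) and that color values are normalized to [0, 1]. BlendMath is a hypothetical name used only for this illustration.

```csharp
using System;

// Hypothetical illustration of fixed-function blend results for one color channel.
public static class BlendMath
{
    // glBlendFunc(SrcAlpha, OneMinusSrcAlpha): the source partially replaces the destination.
    public static float AlphaBlend(float src, float srcAlpha, float dst)
        => src * srcAlpha + dst * (1f - srcAlpha);

    // glBlendFunc(SrcAlpha, One): the source is added on top of the destination,
    // clamped to 1 as in a normalized framebuffer.
    public static float Additive(float src, float srcAlpha, float dst)
        => Math.Min(1f, src * srcAlpha + dst);
}
```

Before clamping, the two results differ by dst × srcAlpha, which is never negative, so the alpha-blended result can only match or fall below the additive one. With src = 0.5, srcAlpha = 0.5, dst = 0.5, alpha-blend gives 0.5 while additive gives 0.75. The gap widens over bright backgrounds, which is where a regression would be easiest to spot.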
- Step 14.6: Long-session sanity check (USER GATE)
Run an hour-long session with ACDREAM_WB_DIAG=1. Watch the [WB-DIAG] resident handle count grow (you'll need to add a bindlessHandlesCount field to the diag log — small task; if not done, just monitor process VRAM via Task Manager / similar). Expected: bounded plateau under 5K handles.
If unbounded growth: file an N.6 follow-up issue, don't block the ship.
- Step 14.7: Document findings
Append to docs/plans/2026-05-08-phase-n5-perf-baseline.md:
## Visual verification (Task 14)
- Holtburg courtyard: PASS / FAIL (note specific issues)
- Foundry interior: PASS / FAIL
- Cell transitions: PASS / FAIL
- Character close-up (Issue #47): PASS / FAIL
- Magic content (additive check): PASS / FAIL
- Long-session sanity: PASS / FAIL — peak resident handles ~N
- Step 14.8: Commit findings (no code change)
phase(N.5) Task 14: visual verification — all gates pass
[Or if any failed: amend with sub-task to address.]
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 15: Delete legacy mesh_instanced shader files
Files:
- Delete: src/AcDream.App/Rendering/Shaders/mesh_instanced.vert
- Delete: src/AcDream.App/Rendering/Shaders/mesh_instanced.frag
- Modify: src/AcDream.App/Rendering/GameWindow.cs (remove fallback path)
This task removes the fallback shader path. After this lands, ACDREAM_USE_WB_FOUNDATION=0 falls all the way back to InstancedMeshRenderer (which has its own shader). The intermediate "WB foundation on but bindless missing" state no longer exists — if bindless is missing, we treat it as foundation-off.
- Step 15.1: Delete shader files
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.vert
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.frag
- Step 15.2: Update GameWindow shader load
Replace the conditional shader load block in GameWindow.cs with the single modern path:
if (_bindlessSupport is not null)
{
    _meshShader = new Shader(_gl,
        Path.Combine(shadersDir, "mesh_modern.vert"),
        Path.Combine(shadersDir, "mesh_modern.frag"));
    Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
    // Bindless missing — log and skip WbDrawDispatcher construction so
    // InstancedMeshRenderer handles all rendering (same effect as
    // ACDREAM_USE_WB_FOUNDATION=0).
    Console.WriteLine("[N.5] bindless extension missing — falling back to InstancedMeshRenderer");
    // _meshShader stays unloaded; InstancedMeshRenderer owns its own shader path.
    // The `_dispatcher = new WbDrawDispatcher(...)` site below must be wrapped:
    //   _dispatcher = (_bindlessSupport is not null) ? new WbDrawDispatcher(...) : null;
    // and the per-frame draw call must guard `_dispatcher?.Draw(...)`.
}
Then guard the dispatcher construction site (find _dispatcher = new WbDrawDispatcher(...) in the same file):
_dispatcher = (_bindlessSupport is not null)
    ? new WbDrawDispatcher(_gl, _meshShader, _textureCache, _meshAdapter, _entitySpawnAdapter, _bindlessSupport)
    : null;
And the per-frame call site:
_dispatcher?.Draw(camera, landblockEntries, frustum, ...);
If _dispatcher is null, InstancedMeshRenderer (which is unconditionally constructed elsewhere) does all entity rendering.
- Step 15.3: Build + tests
Run: dotnet build
Expected: PASS.
Run: dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"
Expected: PASS.
- Step 15.4: Smoke test (legacy fallback path)
Test the legacy fallback by running with foundation off:
$env:ACDREAM_USE_WB_FOUNDATION = "0"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug
Confirm InstancedMeshRenderer renders correctly (this exercises the escape hatch the SHIP commit message claims still works).
- Step 15.5: Commit
phase(N.5) Task 15: delete legacy mesh_instanced shader files
mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses
mesh_modern (bindless + multi-draw indirect). Legacy escape hatch
runs via InstancedMeshRenderer + ACDREAM_USE_WB_FOUNDATION=0 — its
own shader path, untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 16: Update CLAUDE.md WB integration cribs
Files:
- Modify: CLAUDE.md
- Step 16.1: Read existing WB integration cribs section
Read CLAUDE.md lines 28-80 (the "WB integration cribs" section).
- Step 16.2: Add N.5 patterns
Append to the WB integration cribs section after the existing bullets:
- **N.5 modern dispatch** uses bindless textures + multi-draw indirect.
`WbDrawDispatcher.Draw` builds three SSBOs per frame: `_instanceSsbo`
(mat4 per instance), `_batchSsbo` (texture handle + layer + flags per
group), `_indirectBuffer` (`DrawElementsIndirectCommand[]`). Two
`glMultiDrawElementsIndirect` calls per frame — opaque, transparent.
See `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
- **`TextureCache` requires `BindlessSupport`** for the WB modern path.
Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles
made resident at upload time. Old `uint`-returning methods stay for
Sky / Terrain / Debug renderers.
- **Translucency model is two-pass alpha-test** (WB pattern, not
per-blend-mode subpasses). Opaque pass discards `α<0.95`, transparent
pass discards `α≥0.95`. Native `Additive` blend renders as alpha-blend
on GfxObj surfaces — falsifiable; if a regression shows up on magic
content, add a third indirect call with `glBlendFunc(SrcAlpha, One)`.
- **Per-instance highlight (selection blink) is reserved.** `InstanceData`
has a documented hook for `vec4 highlightColor` — Phase B.4 follow-up
adds the field + plumbs server-side selection state. Stride grows from
64 → 80 bytes when added; shader updates trivially.
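The two-pass alpha-test rule in the translucency bullet above means every fragment survives exactly one pass. A CPU-side sketch of the same predicate, useful for reasoning about the partition, is shown below. RenderPass and AlphaTestPartition are hypothetical names; the real test lives in the fragment shader's uRenderPass discard.

```csharp
// Hypothetical CPU-side mirror of the shader's discard rule; not code from the repo.
public enum RenderPass { Opaque, Transparent }

public static class AlphaTestPartition
{
    public const float Threshold = 0.95f;

    // True when a fragment with the given alpha is kept (not discarded) in the given pass.
    public static bool Survives(RenderPass pass, float alpha) => pass switch
    {
        RenderPass.Opaque => alpha >= Threshold,      // opaque pass discards α < 0.95
        RenderPass.Transparent => alpha < Threshold,  // transparent pass discards α ≥ 0.95
        _ => false,
    };
}
```

Because the two conditions are exact complements, no fragment is drawn twice and none is dropped by both passes, whatever its alpha.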
- Step 16.3: Build (sanity check: the change is markdown-only, so this only confirms the tree still builds)
Run: dotnet build
Expected: PASS.
- Step 16.4: Commit
phase(N.5) Task 16: extend CLAUDE.md WB cribs with N.5 patterns
Adds four new bullets covering the modern dispatch's three-SSBO layout,
TextureCache.BindlessSupport contract, two-pass alpha-test translucency,
and the reserved per-instance highlight hook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 17: Update memory + roadmap
Files:
- Create: memory/project_phase_n5_state.md (under user's ~/.claude/projects/.../memory/)
- Modify: MEMORY.md (under user's ~/.claude/projects/.../memory/)
- Modify: docs/plans/2026-04-11-roadmap.md
Memory files live under C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\ per the auto memory system prompt section.
- Step 17.1: Create memory entry for N.5 state
Create C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\project_phase_n5_state.md:
---
name: Project: Phase N.5 state (shipped 2026-05-XX)
description: N.5 lifted WbDrawDispatcher onto bindless + multi-draw indirect. CPU dispatcher time dropped to ~30-40% of N.4. Three new gotchas captured.
type: project
---
**Phase N.5 — Modern Rendering Path — shipped 2026-05-XX.**
WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect.
Per-frame: 3 SSBO uploads + 2 indirect calls (opaque + transparent). All
textures are 1-layer Texture2DArray; sampler2DArray in shader.
Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
Spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
**Why:** N.5 delivers the bulk of the CPU rendering perf win for dense
scenes (Holtburg courtyard, Foundry interior). N.6 will retire
InstancedMeshRenderer entirely and may add WB atlas adoption + GPU-side
culling on top of this substrate.
**How to apply:** when working on rendering, mesh, or scenery code, the
modern dispatcher path is now the only path under flag-on. Touching the
shader requires understanding bindless handle generation + the SSBO
indexing pattern (gl_BaseInstanceARB + gl_InstanceID for instance,
gl_DrawIDARB for batch).
## Three gotchas surfaced during N.5 implementation
[FILL IN AT SHIP TIME — common candidates:]
1. SSBO upload size off-by-one if you forget instance-stride alignment.
2. `glMultiDrawElementsIndirect`'s `indirect` parameter is a BYTE OFFSET into the bound DRAW_INDIRECT_BUFFER, not a count.
3. Bindless handle 0 is not a usable handle (glGetTextureHandleARB returns 0 on error), so guard for it before populating BatchData.
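The indexing pattern and the byte-offset gotcha in the memory entry above can be made concrete with a small layout sketch. The struct mirrors the five 32-bit fields of OpenGL's DrawElementsIndirectCommand. The field names, the IndirectLayout helper, and the running-sum assignment of BaseInstance are assumptions for illustration and may differ from the project's DrawElementsIndirectCommand.cs.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-in for the project's DEIC struct: five tightly packed 32-bit values (20 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;          // number of indices to draw
    public uint InstanceCount;  // number of instances of this group
    public uint FirstIndex;     // offset into the index buffer, in indices
    public int BaseVertex;      // value added to each index
    public uint BaseInstance;   // first slot of this group in the instance array
}

// Hypothetical helper; not a type from the codebase.
public static class IndirectLayout
{
    // Gives each command a BaseInstance equal to the total instance count of the
    // commands before it, so gl_BaseInstanceARB + gl_InstanceID walks one
    // contiguous per-instance array.
    public static void AssignBaseInstances(Span<DrawElementsIndirectCommand> commands)
    {
        uint running = 0;
        for (int i = 0; i < commands.Length; i++)
        {
            commands[i].BaseInstance = running;
            running += commands[i].InstanceCount;
        }
    }

    // Byte offset of the transparent section when opaque commands are stored first.
    // This is the value for the `indirect` argument of the second indirect call:
    // a byte offset into the buffer, not a command index.
    public static nint TransparentSectionOffset(int opaqueCommandCount)
        => (nint)opaqueCommandCount * Marshal.SizeOf<DrawElementsIndirectCommand>();
}
```

With two opaque commands ahead of the transparent section, the second call's offset is 2 × 20 = 40 bytes; passing 2 instead would point the driver into the middle of the first command.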
- Step 17.2: Add MEMORY.md index entry
Edit C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\MEMORY.md. Add immediately after the existing N.4 line:
- [Project: Phase N.5 state](project_phase_n5_state.md) — **N.5 SHIPPED 2026-05-XX.** WbDrawDispatcher on bindless + multi-draw indirect. CPU dispatcher ~30-40% of N.4. Three driver-touching gotchas captured.
- Step 17.3: Update roadmap
Edit docs/plans/2026-04-11-roadmap.md. Move N.5 from "Currently in flight" to the "Shipped" table. Add N.6 as the new "in flight" or "next" entry per the user's preferred sequencing.
- Step 17.4: Commit memory + roadmap
git add docs/plans/2026-04-11-roadmap.md
git commit -m "phase(N.5): roadmap — N.5 shipped, N.6 next
[heredoc body]"
(Memory files live outside the repository, under ~/.claude/..., so they are not committed.)
Heredoc body:
phase(N.5): roadmap — N.5 shipped, N.6 next
Moves N.5 from in-flight to Shipped. Records the perf wins from
Task 13's measurement table. N.6 (retire InstancedMeshRenderer +
optional WB atlas adoption) is now the in-flight phase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Task 18: Plan finalization — append SHIP section
Files:
- Modify: docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md (this file)
- Step 18.1: Add SHIP section at the end of this plan
Append to this plan file (docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md):
---
## SHIP record
**Shipped: 2026-05-XX** at commit [SHIP commit SHA].
**Acceptance gates:**
- [✓] Visual identity to N.4 — confirmed at Holtburg courtyard, Foundry interior, indoor↔outdoor transitions, drudge close-up, magic content.
- [✓] CPU dispatcher time ≤ 70% of N.4 — measured: N.4=Xµs / N.5=Yµs (Z% reduction).
- [✓] GPU rendering time within ±10% of N.4 — measured: N.4=Aµs / N.5=Bµs.
- [✓] `drawsIssued ≤ 5 per pass` — measured: N opaque + M transparent per frame.
- [✓] All tests green — 60+ N.4 tests + 7 new N.5 tests.
- [✓] `ACDREAM_USE_WB_FOUNDATION=0` still works — InstancedMeshRenderer fallback verified.
**Adjustments captured during execution:** [list any spec amendments — e.g., additive sub-pass added if Task 14.5 found regressions].
**Out-of-scope follow-ups (per spec §10):**
- N.6: retire `InstancedMeshRenderer`.
- N.6 candidate: persistent-mapped buffers if `glBufferData` shows up in profiling.
- N.6 candidate: WB atlas adoption for memory savings on shared content.
- Phase B.4 follow-up: per-instance `highlightColor` for selection blink.
- (Long-session memory pressure — log evidence in N.6 watchlist.)
- Step 18.2: Commit
git add docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
git commit -m "phase(N.5): plan finalization — SHIP record appended
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
Task 19: SHIP commit
Files:
- (no code change — single empty commit OR amend the perf baseline commit's message)
- Step 19.1: Verify clean tree + green build/test
git status
dotnet build
dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"
Expected: clean tree, build PASS, all tests PASS.
- Step 19.2: Create SHIP commit
git commit --allow-empty -m "phase(N.5): SHIP — modern rendering path on N.4 dispatcher
[heredoc body]"
Heredoc body:
phase(N.5): SHIP — modern rendering path on N.4 dispatcher
Bindless textures + glMultiDrawElementsIndirect. Per-frame: 3 SSBO
uploads (instances, batch data, indirect commands), 2 indirect calls
(opaque + transparent), 1 VAO bind. Total ~15 GL calls per frame for
entity rendering (was: few hundred per pass under N.4).
Acceptance gates (from spec §8.3):
- Visual identity to N.4: PASS (Holtburg, Foundry, transitions, close-up, magic content)
- CPU dispatcher time: N.4=[Xµs] → N.5=[Yµs] ([Z]% reduction; gate ≥30%)
- GPU rendering time: within ±10% of N.4 — PASS
- drawsIssued ≤ 5 per pass: PASS
- All tests green: PASS (67+ tests)
- Legacy fallback (ACDREAM_USE_WB_FOUNDATION=0): PASS
Plan archived at docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Step 19.3: Confirm commit
git log --oneline -5
Expected: top commit is "phase(N.5): SHIP — ...".
Self-review checklist
After all tasks complete, verify against the spec:
- Spec §2 Decision 1 (sampler2DArray): TextureCache uploads as Texture2DArray (Task 2). Shader samples via sampler2DArray (Task 5). ✓
- Spec §2 Decision 2 (two-pass alpha-test): Shader uses uRenderPass discard (Task 5). Dispatcher runs two passes (Task 10). Translucency partition test (Task 11). ✓
- Spec §2 Decision 3 (SSBO): _instanceSsbo + _batchSsbo at bindings 0+1 (Tasks 7+10). Shader reads via gl_BaseInstanceARB + gl_DrawIDARB (Task 5). ✓
- Spec §2 Decision 4 (resident on upload): MakeResidentHandle (Task 3) + Dispose order (Task 4). ✓
- Spec §2 Decision 5 (two-way flag): Capability check + fallback in GameWindow (Task 6+15). ✓
- Spec §2 Decision 6 (CPU stopwatch + GL queries): Task 12. Numbers in SHIP message (Task 19). ✓
- Spec §2 Decision 7 (defer persistent-mapped): No persistent-mapped code in this plan. ✓
- Spec §2 Decision 8 (defer highlight): InstanceData comment reserves field (Task 5). ✓
- Spec §4.1 TextureCache changes: Tasks 2-4. ✓
- Spec §4.2 WbDrawDispatcher changes: Tasks 7-10. ✓
- Spec §4.3 New shader files: Task 5. ✓
- Spec §6 Translucency detail: Tasks 10-11. ✓
- Spec §7 Error handling: Task 6 (capability + compile fallback) + Task 4 (disposal order). ✓
- Spec §8 Testing: Task 9 (indirect builder), Task 11 (translucency), Task 13 (perf), Task 14 (visual). ✓
- Spec §9 Risks: Capability check + fallback paths in Tasks 6+15. ✓
No placeholders. No "implement later" tasks. Every step has either code or an exact command.
End of plan.