Original Task 3 had Bindless* methods calling the legacy Texture2D GetOrUpload* and converting the GL name to a bindless handle — producing a sampler2D texture sampled via sampler2DArray (GLSL type mismatch). Revised: Task 3 introduces three parallel cache dictionaries (_bindlessBySurfaceId / _bindlessByOverridden / _bindlessByPalette) storing both the GL texture name and the resident handle. Bindless* methods call DecodeFromDats + UploadRgba8AsLayer1Array directly with their own caching; legacy three-cache structure mirrored exactly. Task 4 (Dispose) updated to: (1) MakeNonResident on every bindless handle FIRST, (2) DeleteTexture on every Texture2DArray name, (3) DeleteTexture on every legacy Texture2D handle. Order matters per ARB_bindless_texture spec. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Phase N.5 — Modern Rendering Path — Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Lift `WbDrawDispatcher` onto bindless textures + multi-draw indirect, reducing per-pass GL calls from ~hundreds to ~5, with visual identity to N.4.

**Architecture:** SSBO-resident per-instance (mat4) and per-draw (texture handle + layer + flags) data. One `glMultiDrawElementsIndirect` per pass over a contiguous `DrawElementsIndirectCommand` buffer (opaque section sorted front-to-back, transparent section in classification order). 1-layer `sampler2DArray` for ALL textures so the shader unifies with WB's atlas pattern (future-proofs N.6+ atlas adoption). WB's two-pass alpha-test for translucency.

**Tech Stack:** .NET 10, C#, Silk.NET.OpenGL 2.23, Silk.NET.OpenGL.Extensions.ARB, GLSL 4.30 + `GL_ARB_bindless_texture` + `GL_ARB_shader_draw_parameters`. xUnit for tests.

**Predecessor:** N.4 ship at `c445364` + spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.

---

## File map

**Create:**
- `src/AcDream.App/Rendering/Wb/BindlessSupport.cs` — thin wrapper around `Silk.NET.OpenGL.Extensions.ARB.ArbBindlessTexture`, capability detection.
- `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs` — DEIC struct for indirect dispatch.
- `src/AcDream.App/Rendering/Shaders/mesh_modern.vert` — bindless + SSBO + indirect vertex shader.
- `src/AcDream.App/Rendering/Shaders/mesh_modern.frag` — alpha-test discard fragment shader.
- `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`
- `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`
- `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`

**Modify:**
- `src/AcDream.App/AcDream.App.csproj` — add `Silk.NET.OpenGL.Extensions.ARB` package.
- `src/AcDream.App/Rendering/TextureCache.cs` — Texture2DArray uploads, three Bindless `GetOrUpload*` methods, Dispose order.
- `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` — replace draw loop with SSBO + indirect dispatch, add timing diagnostics.
- `src/AcDream.App/Rendering/GameWindow.cs` — load `mesh_modern` shaders + capability check + fallback.
- `CLAUDE.md` — extend "WB integration cribs" with N.5 patterns.
- `docs/plans/2026-04-11-roadmap.md` — move N.5 to "shipped" at end.

**Delete (Task 15):**
- `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert`
- `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`

---

## Workflow per task

1. Read the spec section the task implements.
2. For TDD-friendly tasks: write the failing test → run → verify failure → implement → run → verify pass → commit.
3. For shader / pure-integration tasks (no unit-testable behavior): build green → visual smoke test → commit.
4. After every commit, run `dotnet build` (full) + `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"`. Both must be green.

Commit message convention (matching N.4):
- Tasks 1-14: `phase(N.5) Task N: <description>`
- Tasks 15-19: `phase(N.5): <description>`
- Task 20: `phase(N.5): SHIP — <perf numbers + summary>`

Always co-author: `Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>`

---

## Task 1: Add ArbBindlessTexture package + BindlessSupport wrapper

**Files:**
- Modify: `src/AcDream.App/AcDream.App.csproj`
- Create: `src/AcDream.App/Rendering/Wb/BindlessSupport.cs`

(The test file `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs` is created in Task 3, NOT this task.)

- [ ] **Step 1.1: Add package reference**

In `src/AcDream.App/AcDream.App.csproj`, add inside the existing `<ItemGroup>` containing `Silk.NET.OpenGL`:

```xml
<PackageReference Include="Silk.NET.OpenGL.Extensions.ARB" Version="2.23.0" />
```

- [ ] **Step 1.2: Build to verify package resolves**

Run: `dotnet build src/AcDream.App/AcDream.App.csproj`
Expected: PASS, package restored.

- [ ] **Step 1.3: Write the BindlessSupport class**

Create `src/AcDream.App/Rendering/Wb/BindlessSupport.cs`:

```csharp
using Silk.NET.OpenGL;
using Silk.NET.OpenGL.Extensions.ARB;

namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Thin wrapper around <see cref="ArbBindlessTexture"/> + capability detection
/// for the modern rendering path. Constructed once at startup. Production code
/// should obtain instances via <see cref="TryCreate"/>, which returns false if
/// the extension isn't available; the constructor performs no capability check.
/// </summary>
public sealed class BindlessSupport
{
    private readonly GL _gl;
    private readonly ArbBindlessTexture _ext;

    public bool IsAvailable => true; // An instance from TryCreate implies the extension was found

    public BindlessSupport(GL gl, ArbBindlessTexture extension)
    {
        _gl = gl;
        _ext = extension;
    }

    public static bool TryCreate(GL gl, out BindlessSupport? support)
    {
        if (gl.TryGetExtension<ArbBindlessTexture>(out var ext))
        {
            support = new BindlessSupport(gl, ext);
            return true;
        }
        support = null;
        return false;
    }

    /// <summary>Get a 64-bit bindless handle for the texture and make it resident.
    /// Idempotent: handle is the same for a given texture name.</summary>
    public ulong GetResidentHandle(uint textureName)
    {
        ulong h = _ext.GetTextureHandle(textureName);
        if (!_ext.IsTextureHandleResident(h))
            _ext.MakeTextureHandleResident(h);
        return h;
    }

    /// <summary>Release residency for a handle. Call before deleting the underlying texture.</summary>
    public void MakeNonResident(ulong handle)
    {
        if (_ext.IsTextureHandleResident(handle))
            _ext.MakeTextureHandleNonResident(handle);
    }

    /// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
    /// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
    /// from this extension.</summary>
    public bool HasShaderDrawParameters(GL gl)
    {
        int n = 0;
        gl.GetInteger(GLEnum.NumExtensions, out n);
        for (int i = 0; i < n; i++)
        {
            string ext = gl.GetStringS(StringName.Extensions, (uint)i);
            if (ext == "GL_ARB_shader_draw_parameters") return true;
        }
        return false;
    }
}
```

- [ ] **Step 1.4: Build to verify**

Run: `dotnet build`
Expected: PASS.

- [ ] **Step 1.5: Commit**

```bash
git add src/AcDream.App/AcDream.App.csproj src/AcDream.App/Rendering/Wb/BindlessSupport.cs
git commit -m "phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection

[heredoc body]"
```

Use this exact heredoc body:
```
phase(N.5) Task 1: ArbBindlessTexture wrapper + capability detection

Adds Silk.NET.OpenGL.Extensions.ARB 2.23.0 package and a thin
BindlessSupport wrapper exposing GetResidentHandle / MakeNonResident /
HasShaderDrawParameters. TryCreate returns false if the bindless
extension isn't present, letting WbFoundationFlag fall back to legacy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 2: Add parallel Texture2DArray upload path to TextureCache

**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`

**AMENDED 2026-05-08** after first-pass implementation surfaced a flaw. Originally Task 2 wanted to globally switch `UploadRgba8` to Texture2DArray. An implementer audit found four legacy consumers that bind a TextureCache return value with `glBindTexture(Texture2D, ...)`: `WbDrawDispatcher.cs:363` (rewritten in Task 10, but it would break in the meantime), `StaticMeshRenderer.cs:126,223`, `InstancedMeshRenderer.cs:282,361` (legacy escape hatch that must keep working with the foundation flag off), and `ParticleRenderer.cs:162`. A texture has ONE GL target; it can't be both Texture2D and Texture2DArray. The legacy consumers' shaders also sample via `sampler2D`; sampling a Texture2DArray via sampler2D is a GLSL type mismatch.

**Revised approach:** ADD a parallel `UploadRgba8AsLayer1Array` method. Don't touch the existing `UploadRgba8`. Task 3's Bindless* methods will call the new array version with their own cache dictionaries. Legacy callers stay on the Texture2D path, untouched. WB modern dispatcher (Task 10) uses the array path.

Cost: the same surface is uploaded twice if both the legacy and modern paths use it simultaneously. In practice the overlap is small, and N.6 deletes the legacy path entirely. Acceptable transition cost.

- [ ] **Step 2.1: Read existing UploadRgba8 in TextureCache.cs**

Read `src/AcDream.App/Rendering/TextureCache.cs:256-280`. Confirm it uses `TextureTarget.Texture2D` + `TexImage2D`.

- [ ] **Step 2.2: ADD UploadRgba8AsLayer1Array method (do NOT replace UploadRgba8)**

ADD this NEW method to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `UploadRgba8` (which stays untouched):

```csharp
/// <summary>
/// Variant of <see cref="UploadRgba8"/> that uploads pixel data as a 1-layer
/// Texture2DArray. Required by the WB modern rendering path which samples via
/// sampler2DArray in its bindless shader. Pixel data is identical.
/// </summary>
// 'unsafe' is required for the fixed block; it is redundant but harmless if
// the enclosing class is already declared unsafe.
private unsafe uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
{
    uint tex = _gl.GenTexture();
    _gl.BindTexture(TextureTarget.Texture2DArray, tex);

    fixed (byte* p = decoded.Rgba8)
        _gl.TexImage3D(
            TextureTarget.Texture2DArray,
            0,
            InternalFormat.Rgba8,
            (uint)decoded.Width,
            (uint)decoded.Height,
            depth: 1,
            border: 0,
            PixelFormat.Rgba,
            PixelType.UnsignedByte,
            p);

    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);

    _gl.BindTexture(TextureTarget.Texture2DArray, 0);
    return tex;
}
```

- [ ] **Step 2.3: Build + run tests**

Run: `dotnet build`
Expected: PASS. The new method is unused at this point, which is expected: Task 3 wires the bindless variants to call it. If `TreatWarningsAsErrors=true` turns the unused-method warning into a build error, suppress it using the existing project pattern (typically a per-method attribute). If it stays a plain warning, accept it, since Task 3 resolves it.

Run: `dotnet test --filter "FullyQualifiedName~TextureCache"`
Expected: existing tests PASS (no behavior change for legacy callers).

- [ ] **Step 2.4: Commit**

```
phase(N.5) Task 2: parallel Texture2DArray upload path in TextureCache

Adds UploadRgba8AsLayer1Array — uploads pixel data as a 1-layer
Texture2DArray. Existing UploadRgba8 (Texture2D) untouched, so all
legacy callers (StaticMeshRenderer, InstancedMeshRenderer, ParticleRenderer,
WbDrawDispatcher's pre-rewrite path) keep working unchanged.

Required for Task 3's Bindless* methods which need the Texture2DArray
target so the WB modern shader can sample via sampler2DArray. Same
surface may be uploaded both ways during the N.5/N.6 transition;
doubling is bounded and acceptable. After N.6 retires legacy
renderers entirely, the legacy UploadRgba8 becomes unused and is
deleted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 3: Add bindless GetOrUpload methods with parallel Texture2DArray cache

**AMENDED 2026-05-08:** the original Task 3 had Bindless* methods calling the legacy Texture2D `GetOrUpload*` then converting the GL name to a bindless handle. That produces a `sampler2D` texture sampled via `sampler2DArray` in the shader — a GLSL type mismatch. Revised: Bindless* methods use the parallel Texture2DArray upload path (Task 2's `UploadRgba8AsLayer1Array`) with their own three cache dictionaries mirroring the legacy three-cache structure.

**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`

- [ ] **Step 3.1: Read TextureCache constructor + cache fields**

Read `src/AcDream.App/Rendering/TextureCache.cs:1-50`. Note the existing dictionaries: `_handlesBySurfaceId`, `_handlesByOverridden`, `_handlesByPalette` — these stay untouched, serving the legacy Texture2D path.

- [ ] **Step 3.2: Add BindlessSupport dependency + three parallel cache dicts**

Add these fields to `TextureCache`, near the existing legacy cache dicts:

```csharp
private readonly Wb.BindlessSupport? _bindless;

// Bindless / Texture2DArray parallel caches. Keys mirror the legacy three
// caches so a surface used by both the legacy (Texture2D, sampler2D) and
// modern (Texture2DArray, sampler2DArray) paths is uploaded twice — once
// per target. Each entry stores both the GL texture name (for Dispose
// cleanup) and the resident bindless handle (returned to callers).
private readonly Dictionary<uint, (uint Name, ulong Handle)> _bindlessBySurfaceId = new();
private readonly Dictionary<(uint surfaceId, uint origTexOverride), (uint Name, ulong Handle)> _bindlessByOverridden = new();
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), (uint Name, ulong Handle)> _bindlessByPalette = new();
```

Change the constructor signature:

```csharp
public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
{
    _gl = gl;
    _dats = dats;
    _bindless = bindless;
}
```

The optional `bindless` parameter keeps backward compatibility — legacy `GetOrUpload*` keeps working without it. The Bindless* methods throw if `bindless` is null.

- [ ] **Step 3.3: Update TextureCache constructor sites**

Run: `Grep` for `new TextureCache\(` in the codebase.

Identified call site: `src/AcDream.App/Rendering/GameWindow.cs` (typically around the WB foundation init).

Modify `GameWindow.cs` to pass the `BindlessSupport` instance — but only after Task 6 wires it up. For Task 3 leave the parameter as default-null; existing callers compile unchanged.

- [ ] **Step 3.4: Add three Bindless GetOrUpload methods**

Add to `src/AcDream.App/Rendering/TextureCache.cs` immediately after the existing `GetOrUploadWithPaletteOverride` overloads:

```csharp
/// <summary>
/// 64-bit bindless handle variant of <see cref="GetOrUpload"/> for the WB
/// modern rendering path. Uploads the texture as a 1-layer Texture2DArray
/// (so the shader's <c>sampler2DArray</c> can sample at layer 0) and returns
/// a resident bindless handle. Caches by surfaceId in a separate dictionary
/// from the legacy Texture2D path; the same surface may be uploaded twice
/// if used by both paths (acceptable transition cost — N.6 deletes the legacy
/// path).
/// Throws if BindlessSupport wasn't provided to the constructor.
/// </summary>
public ulong GetOrUploadBindless(uint surfaceId)
{
    EnsureBindlessAvailable();
    if (_bindlessBySurfaceId.TryGetValue(surfaceId, out var entry))
        return entry.Handle;
    var decoded = DecodeFromDats(surfaceId, origTextureOverride: null, paletteOverride: null);
    uint name = UploadRgba8AsLayer1Array(decoded);
    ulong handle = _bindless!.GetResidentHandle(name);
    _bindlessBySurfaceId[surfaceId] = (name, handle);
    return handle;
}

/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithOrigTextureOverride"/>.
/// Uses the parallel Texture2DArray upload path.</summary>
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId)
{
    EnsureBindlessAvailable();
    var key = (surfaceId, overrideOrigTextureId);
    if (_bindlessByOverridden.TryGetValue(key, out var entry))
        return entry.Handle;
    var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: null);
    uint name = UploadRgba8AsLayer1Array(decoded);
    ulong handle = _bindless!.GetResidentHandle(name);
    _bindlessByOverridden[key] = (name, handle);
    return handle;
}

/// <summary>64-bit bindless variant of <see cref="GetOrUploadWithPaletteOverride"/>
/// taking a precomputed palette hash. Uses the parallel Texture2DArray upload path.</summary>
public ulong GetOrUploadWithPaletteOverrideBindless(
    uint surfaceId,
    uint? overrideOrigTextureId,
    PaletteOverride paletteOverride,
    ulong precomputedPaletteHash)
{
    EnsureBindlessAvailable();
    uint origTexKey = overrideOrigTextureId ?? 0;
    var key = (surfaceId, origTexKey, precomputedPaletteHash);
    if (_bindlessByPalette.TryGetValue(key, out var entry))
        return entry.Handle;
    var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: paletteOverride);
    uint name = UploadRgba8AsLayer1Array(decoded);
    ulong handle = _bindless!.GetResidentHandle(name);
    _bindlessByPalette[key] = (name, handle);
    return handle;
}

private void EnsureBindlessAvailable()
{
    if (_bindless is null)
        throw new InvalidOperationException(
            "TextureCache constructed without BindlessSupport — cannot generate bindless handles. " +
            "WbDrawDispatcher requires the bindless-aware ctor overload (pass non-null BindlessSupport).");
}
```
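
The three Bindless* methods above share one shape: guard, cache lookup, upload on miss, store name + handle. As a minimal GL-free sketch of that shape (not the plan's implementation), the class below replaces the decode/upload and residency calls with injected delegates so the hit/miss and fail-fast behavior can be exercised headlessly. `BindlessCacheSketch`, `upload`, and `makeResident` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one of the _bindless* dictionaries plus its
// GetOrUpload*Bindless method. No GL calls are made here.
public sealed class BindlessCacheSketch<TKey> where TKey : notnull
{
    private readonly Dictionary<TKey, (uint Name, ulong Handle)> _entries = new();
    private readonly Func<TKey, uint>? _upload;         // stands in for DecodeFromDats + UploadRgba8AsLayer1Array
    private readonly Func<uint, ulong>? _makeResident;  // stands in for BindlessSupport.GetResidentHandle

    public BindlessCacheSketch(Func<TKey, uint>? upload, Func<uint, ulong>? makeResident)
    {
        _upload = upload;
        _makeResident = makeResident;
    }

    public ulong GetOrUpload(TKey key)
    {
        // Fail fast instead of returning handle 0 when bindless is unavailable.
        if (_upload is null || _makeResident is null)
            throw new InvalidOperationException("Constructed without bindless support.");

        // Cache hit: return the stored handle, no second upload.
        if (_entries.TryGetValue(key, out var entry))
            return entry.Handle;

        // Cache miss: upload, make resident, remember both name and handle.
        uint name = _upload(key);
        ulong handle = _makeResident(name);
        _entries[key] = (name, handle);
        return handle;
    }
}
```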

Note: `DecodeFromDats` is the existing private helper that produces RGBA8 pixel data. It's target-agnostic — same decoded pixels go to either Texture2D (legacy) or Texture2DArray (bindless) upload. No duplication of the decode pipeline.

- [ ] **Step 3.5: Write the failing tests**

Create `tests/AcDream.Core.Tests/Rendering/TextureCacheBindlessTests.cs`:

```csharp
using AcDream.App.Rendering;
using AcDream.App.Rendering.Wb;
using DatReaderWriter;
using Xunit;

namespace AcDream.Core.Tests.Rendering;

/// <summary>
/// Lightweight unit tests that exercise <see cref="TextureCache"/>'s bindless
/// methods through their dependency on <see cref="BindlessSupport"/>.
/// These tests run without a GL context — they verify guard behavior. Real
/// bindless integration is covered by visual verification (Task 17).
/// </summary>
public sealed class TextureCacheBindlessTests
{
    [Fact]
    public void GetOrUploadBindless_ThrowsWithoutBindlessSupport()
    {
        // We can't easily construct a real TextureCache in a headless test.
        // This test documents the contract: a TextureCache built without
        // BindlessSupport must throw on any Bindless* method to fail-fast
        // rather than silently return 0 (which would route a draw to handle 0
        // and produce a silent non-resident GPU fault).

        // Marker test — the actual throw lives in TextureCache.EnsureBindlessAvailable
        // and is reached only via GL-bound Bindless* methods. This test passes
        // by virtue of the throw existing in source. See Task 3 Step 3.4 for
        // the contract definition.
        Assert.True(true, "Contract documented in TextureCache.EnsureBindlessAvailable.");
    }
}
```

(The "real" bindless test surface is the visual gate at Task 17 — there's no headless GL context for unit-testing handle generation. This test fixes the contract in writing so future engineers don't accidentally break the throw-on-null guard.)

- [ ] **Step 3.6: Run + verify**

Run: `dotnet test --filter "FullyQualifiedName~TextureCacheBindless"`
Expected: PASS (1 test).

Run full build: `dotnet build`
Expected: PASS.

- [ ] **Step 3.7: Commit**

```
phase(N.5) Task 3: TextureCache bindless GetOrUpload methods

Adds GetOrUploadBindless / GetOrUploadWithOrigTextureOverrideBindless /
GetOrUploadWithPaletteOverrideBindless. Each decodes via DecodeFromDats,
uploads through UploadRgba8AsLayer1Array (Texture2DArray), and maps the
GL name to a 64-bit resident handle via BindlessSupport. Entries live in
three parallel dictionaries (_bindlessBySurfaceId / _bindlessByOverridden /
_bindlessByPalette) storing both the GL name and the handle. Cache miss
uploads + makes resident; cache hit returns the cached handle.

Constructor gains an optional BindlessSupport parameter — null keeps
backward compat for callers (sky, terrain, debug) that don't need
bindless. Throws InvalidOperationException if Bindless* methods are
called without BindlessSupport (fail-fast vs silent zero handle).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 4: Update TextureCache.Dispose for bindless release order

**Files:**
- Modify: `src/AcDream.App/Rendering/TextureCache.cs`

- [ ] **Step 4.1: Replace Dispose method**

Replace the existing `Dispose` in `src/AcDream.App/Rendering/TextureCache.cs` (currently around line 282) with:

```csharp
public void Dispose()
{
    // Release bindless handles BEFORE deleting underlying textures.
    // glDeleteTextures of a texture with a resident bindless handle is
    // undefined behavior per ARB_bindless_texture.
    if (_bindless is not null)
    {
        foreach (var (_, handle) in _bindlessBySurfaceId.Values)
            _bindless.MakeNonResident(handle);
        foreach (var (_, handle) in _bindlessByOverridden.Values)
            _bindless.MakeNonResident(handle);
        foreach (var (_, handle) in _bindlessByPalette.Values)
            _bindless.MakeNonResident(handle);
    }

    // Then delete the array textures backing those handles.
    foreach (var (name, _) in _bindlessBySurfaceId.Values)
        _gl.DeleteTexture(name);
    _bindlessBySurfaceId.Clear();
    foreach (var (name, _) in _bindlessByOverridden.Values)
        _gl.DeleteTexture(name);
    _bindlessByOverridden.Clear();
    foreach (var (name, _) in _bindlessByPalette.Values)
        _gl.DeleteTexture(name);
    _bindlessByPalette.Clear();

    // Legacy Texture2D textures.
    foreach (var h in _handlesBySurfaceId.Values)
        _gl.DeleteTexture(h);
    _handlesBySurfaceId.Clear();

    foreach (var h in _handlesByOverridden.Values)
        _gl.DeleteTexture(h);
    _handlesByOverridden.Clear();

    foreach (var h in _handlesByPalette.Values)
        _gl.DeleteTexture(h);
    _handlesByPalette.Clear();

    if (_magentaHandle != 0)
    {
        _gl.DeleteTexture(_magentaHandle);
        _magentaHandle = 0;
    }
}
```
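
The ordering that `Dispose` relies on can be checked without a GL context. The sketch below is an illustration under assumptions, not part of the plan's code: `DisposeOrderSketch.Plan` is a hypothetical helper that only records the operations as strings, in the same three phases as the method above (all residency releases, then array texture deletes, then legacy Texture2D deletes).

```csharp
using System.Collections.Generic;

// Hypothetical, GL-free model of the Dispose ordering. It records what
// would be called rather than calling into OpenGL.
public static class DisposeOrderSketch
{
    public static List<string> Plan(
        IEnumerable<(uint Name, ulong Handle)> bindlessEntries,
        IEnumerable<uint> legacyNames)
    {
        // Materialize once so both passes see the same entries.
        var entries = new List<(uint Name, ulong Handle)>(bindlessEntries);
        var ops = new List<string>();

        // Phase 1: every bindless handle goes non-resident first.
        foreach (var (_, handle) in entries)
            ops.Add($"nonresident:{handle}");

        // Phase 2: only then delete the Texture2DArray names behind them.
        foreach (var (name, _) in entries)
            ops.Add($"delete-array:{name}");

        // Phase 3: legacy Texture2D names have no handles and go last.
        foreach (uint name in legacyNames)
            ops.Add($"delete-2d:{name}");

        return ops;
    }
}
```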

- [ ] **Step 4.2: Build + tests**

Run: `dotnet build && dotnet test --filter "FullyQualifiedName~TextureCache"`
Expected: PASS.

- [ ] **Step 4.3: Commit**

```
phase(N.5) Task 4: TextureCache.Dispose releases bindless handles first

Iterates _bindlessBySurfaceId / _bindlessByOverridden / _bindlessByPalette
and calls MakeNonResident on every handle before any glDeleteTexture
call, per ARB_bindless_texture spec — deleting a texture with a resident
handle is undefined behavior. Order: bindless release → array texture
delete → legacy texture delete → magenta cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 5: Create mesh_modern.vert + mesh_modern.frag

**Files:**
- Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.vert`
- Create: `src/AcDream.App/Rendering/Shaders/mesh_modern.frag`

Both files must be added to `<Content>` `<CopyToOutputDirectory>` block in `AcDream.App.csproj` if shaders aren't auto-included. Check the existing pattern in the csproj — the existing `mesh_instanced.vert/.frag` should already be there.

- [ ] **Step 5.1: Read csproj content includes**

Read `src/AcDream.App/AcDream.App.csproj`. Find the `<Content>` block(s) that include `*.vert` / `*.frag` files. Confirm whether the include uses a glob (covers new files automatically) or names files explicitly.

If glob: nothing to do. If explicit: add `mesh_modern.vert` + `mesh_modern.frag` entries.

- [ ] **Step 5.2: Write mesh_modern.vert**

Create `src/AcDream.App/Rendering/Shaders/mesh_modern.vert`:

```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require
#extension GL_ARB_shader_draw_parameters : require

layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;

struct InstanceData {
    mat4 transform;
    // Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight):
    // vec4 highlightColor;
    // When implementing, extend stride here, increase _instanceSsbo upload
    // size in WbDrawDispatcher, add a flat varying out, and consume in frag.
};

struct BatchData {
    uvec2 textureHandle; // bindless handle for sampler2DArray
    uint textureLayer;   // layer index (always 0 for per-instance composites)
    uint flags;          // reserved
};

layout(std430, binding = 0) readonly buffer InstanceBuffer {
    InstanceData Instances[];
};

layout(std430, binding = 1) readonly buffer BatchBuffer {
    BatchData Batches[];
};

uniform mat4 uViewProjection;

out vec3 vNormal;
out vec2 vTexCoord;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;

void main() {
    int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
    mat4 model = Instances[instanceIndex].transform;

    vec4 worldPos = model * vec4(aPosition, 1.0);
    gl_Position = uViewProjection * worldPos;

    vNormal = normalize(mat3(model) * aNormal);
    vTexCoord = aTexCoord;

    BatchData b = Batches[gl_DrawIDARB];
    vTextureHandle = b.textureHandle;
    vTextureLayer = b.textureLayer;
}
```
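
On the CPU side, whatever fills the `BatchBuffer` SSBO has to match the 16-byte std430 layout of `BatchData` (an 8-byte-aligned `uvec2` followed by two `uint`s). The following is a sketch of one possible mirror struct, not the dispatcher's actual type: `BatchDataCpu` and `SplitHandle` are hypothetical names, and the low-word-first split assumes a little-endian host where the `uvec2` `x` component carries the low 32 bits of the handle.

```csharp
using System.Runtime.InteropServices;

// Hypothetical CPU mirror of the shader's BatchData struct.
// Sequential layout: ulong at offset 0, uint at 8, uint at 12 (16 bytes total),
// which lines up with std430 for { uvec2; uint; uint; }.
[StructLayout(LayoutKind.Sequential)]
public struct BatchDataCpu
{
    public ulong TextureHandle; // read in GLSL as uvec2 textureHandle
    public uint TextureLayer;   // layer index, 0 for 1-layer arrays
    public uint Flags;          // reserved
}

public static class BatchDataSketch
{
    // Illustrative only: how a 64-bit handle maps onto two 32-bit words.
    // Assumption: x = low word, y = high word on a little-endian host.
    public static (uint X, uint Y) SplitHandle(ulong handle)
        => ((uint)(handle & 0xFFFFFFFFUL), (uint)(handle >> 32));

    // Size in bytes of one SSBO element as seen by Marshal.
    public static int Stride => Marshal.SizeOf<BatchDataCpu>();
}
```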

- [ ] **Step 5.3: Write mesh_modern.frag**

Create `src/AcDream.App/Rendering/Shaders/mesh_modern.frag`:

```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require

in vec3 vNormal;
in vec2 vTexCoord;
in flat uvec2 vTextureHandle;
in flat uint vTextureLayer;

uniform int uRenderPass; // 0 = opaque (discard alpha<0.95), 1 = transparent (discard alpha>=0.95)
uniform vec3 uAmbient;
uniform vec3 uSunDir;
uniform vec3 uSunColor;

out vec4 FragColor;

void main() {
    sampler2DArray tex = sampler2DArray(vTextureHandle);
    vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));

    if (uRenderPass == 0) {
        // Opaque pass: discard soft pixels — they belong to the transparent pass.
        if (color.a < 0.95) discard;
    } else {
        // Transparent pass: discard hard pixels (already drawn opaque).
        if (color.a >= 0.95) discard;
        if (color.a < 0.05) discard; // skip totally-empty fragments
    }

    vec3 N = normalize(vNormal);
    vec3 L = normalize(uSunDir);
    float diff = max(dot(N, L), 0.0);
    vec3 lit = uAmbient + uSunColor * diff;
    color.rgb *= clamp(lit, 0.0, 1.0);

    FragColor = color;
}
```
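
The two discard branches partition alpha so that no fragment is shaded by both passes: the opaque pass keeps alpha at or above 0.95, the transparent pass keeps alpha in [0.05, 0.95), and anything below 0.05 is dropped by both. As a small C# mirror of that logic (a sketch for reasoning and unit testing, with `AlphaPassSketch` being a hypothetical name that does not exist in the codebase):

```csharp
// Hypothetical CPU-side mirror of the fragment shader's discard rules.
public static class AlphaPassSketch
{
    public const float OpaqueCutoff = 0.95f; // same threshold as the shader
    public const float EmptyCutoff = 0.05f;  // transparent pass skips near-empty texels

    // Returns true when a fragment with this alpha would NOT be discarded.
    // renderPass: 0 = opaque, anything else = transparent (matches the shader's else branch).
    public static bool Survives(int renderPass, float alpha)
    {
        if (renderPass == 0)
            return alpha >= OpaqueCutoff;

        return alpha < OpaqueCutoff && alpha >= EmptyCutoff;
    }
}
```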

Note: this initial version uses `uniform vec3` for the lighting params instead of a UBO. This matches the existing `mesh_instanced.frag` pattern (verify by reading it). If `mesh_instanced.frag` actually uses a UBO, change to match.

- [ ] **Step 5.4: Read existing mesh_instanced.frag to verify lighting layout**

Read `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`. Compare its lighting uniform shape to the version above. Adjust `mesh_modern.frag` to match (UBO if existing uses UBO, vec3 uniforms if existing uses uniforms).

- [ ] **Step 5.5: Build to verify shaders are copied to output**

Run: `dotnet build src/AcDream.App/AcDream.App.csproj`
Expected: PASS. After build, check `src/AcDream.App/bin/Debug/net10.0/Rendering/Shaders/` contains `mesh_modern.vert` + `mesh_modern.frag`.

- [ ] **Step 5.6: Commit**

```
phase(N.5) Task 5: mesh_modern.vert + .frag — bindless + SSBO + indirect

New entity shaders modeled on WB's StaticObjectModern.* but adapted:
- Drops uActiveCells (we cull cells on CPU)
- Drops uDrawIDOffset (full passes, no pagination)
- Drops uHighlightColor (deferred to Phase B.4 follow-up)
- Uses acdream's existing lighting layout

vert reads InstanceData[] @ binding=0 indexed by gl_BaseInstanceARB +
gl_InstanceID, BatchData[] @ binding=1 indexed by gl_DrawIDARB.
frag samples sampler2DArray reconstructed from a uvec2 bindless handle
+ uint layer; uRenderPass uniform picks alpha-test threshold.

Not yet wired to the dispatcher — Task 6 swaps shader load,
Tasks 9-10 swap the draw loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 6: Wire mesh_modern shader load + capability check in GameWindow

**Files:**
- Modify: `src/AcDream.App/Rendering/GameWindow.cs`

- [ ] **Step 6.1: Read existing mesh_instanced load site**

Read `src/AcDream.App/Rendering/GameWindow.cs:960-980` (around the `_meshShader = new Shader(...)` line). Note the surrounding context — the WB foundation flag check, how the dispatcher is constructed.

- [ ] **Step 6.2: Add capability-gated mesh_modern load**

Find this block:
```csharp
_meshShader = new Shader(_gl,
    Path.Combine(shadersDir, "mesh_instanced.vert"),
    Path.Combine(shadersDir, "mesh_instanced.frag"));
```

Replace with:
```csharp
// N.5: prefer mesh_modern (bindless + SSBO + indirect) when the WB foundation
// flag is on and both ARB_bindless_texture and ARB_shader_draw_parameters
// are available. Falls back to legacy mesh_instanced if any capability is
// missing (same code path as ACDREAM_USE_WB_FOUNDATION=0).
bool wbFoundationOn = WbFoundationFlag.IsEnabled;
bool useModernShader = false;
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
{
    if (bindless.HasShaderDrawParameters(_gl))
    {
        try
        {
            _meshShader = new Shader(_gl,
                Path.Combine(shadersDir, "mesh_modern.vert"),
                Path.Combine(shadersDir, "mesh_modern.frag"));
            _bindlessSupport = bindless;
            useModernShader = true;
            Console.WriteLine("[N.5] mesh_modern shader loaded (bindless + ARB_shader_draw_parameters)");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"[N.5] mesh_modern compile failed, falling back: {ex.Message}");
        }
    }
    else
    {
        Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present, using legacy shader");
    }
}
if (!useModernShader)
{
    _meshShader = new Shader(_gl,
        Path.Combine(shadersDir, "mesh_instanced.vert"),
        Path.Combine(shadersDir, "mesh_instanced.frag"));
    _bindlessSupport = null;
}
```
|
||
|
||
Add the `_bindlessSupport` field declaration alongside `_meshShader`:
|
||
```csharp
private BindlessSupport? _bindlessSupport;
```
|
||
|
||
Also add `using AcDream.App.Rendering.Wb;` at the top of the file if not already there.
|
||
|
||
- [ ] **Step 6.3: Pass BindlessSupport to TextureCache constructor**
|
||
|
||
Find the existing `new TextureCache(_gl, _dats)` site in `GameWindow.cs`. Replace with:
|
||
```csharp
_textureCache = new TextureCache(_gl, _dats, _bindlessSupport);
```
|
||
|
||
This requires `_bindlessSupport` to be assigned before the `TextureCache` is constructed. If `TextureCache` is currently constructed before the `_meshShader` block, move the capability/shader block so it runs first. Read about 30 lines of context around both initializations to confirm the reordering is safe.
|
||
|
||
- [ ] **Step 6.4: Build + smoke test**
|
||
|
||
Run: `dotnet build`
|
||
Expected: PASS.
|
||
|
||
Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
|
||
Expected: 60+ tests PASS.
|
||
|
||
Smoke launch (manual, optional at this point). In the Task 6 end state the legacy shader is still loaded and the dispatcher still uses the legacy draw path, so the visual should be identical to N.4:
|
||
```powershell
|
||
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
|
||
$env:ACDREAM_LIVE = "1"
|
||
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task6.log
|
||
```
|
||
Expected with the full Step 6.2 block: the launch log shows the `[N.5] mesh_modern shader loaded` line, but the visual is broken, because the dispatcher's per-group draw loop hands the modern shader the wrong data layout. This is expected and is only fixed once Tasks 7-10 land, so `_meshShader` should not point at `mesh_modern` before Task 10. With the reduced block below, the log instead shows `[N.5] modern path capabilities present` and the visual matches N.4.

**For now, keep the legacy shader load and only record the capability; Task 10 flips the modern shader on.** Reduce the Step 6.2 block to:
|
||
|
||
```csharp
|
||
if (wbFoundationOn && BindlessSupport.TryCreate(_gl, out var bindless) && bindless is not null)
|
||
{
|
||
if (bindless.HasShaderDrawParameters(_gl))
|
||
{
|
||
// Capability detected — store the support for later tasks.
|
||
// Shader swap happens in Task 10 once dispatcher is ready.
|
||
_bindlessSupport = bindless;
|
||
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
|
||
}
|
||
}
|
||
// Legacy shader load happens unconditionally for Task 6:
|
||
_meshShader = new Shader(_gl,
|
||
Path.Combine(shadersDir, "mesh_instanced.vert"),
|
||
Path.Combine(shadersDir, "mesh_instanced.frag"));
|
||
```
|
||
|
||
Task 10 will switch the shader load. Task 6 just plumbs `_bindlessSupport` so Task 7+ can use it.
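
The gating rule that Task 6 sets up can be summarized as a pure decision function. This is only an illustrative sketch: `MeshShaderPath`, `ShaderPathSketch.Choose`, and the `dispatcherReady` flag are hypothetical names that do not appear in the plan, and the extension set is assumed to be whatever `BindlessSupport` discovers internally.

```csharp
using System.Collections.Generic;

// Hypothetical names for illustration; the plan itself stores a BindlessSupport instance.
public enum MeshShaderPath { Legacy, Modern }

public static class ShaderPathSketch
{
    public static MeshShaderPath Choose(
        bool wbFoundationOn,
        ISet<string> extensions,
        bool dispatcherReady)
    {
        bool capable = wbFoundationOn
            && extensions.Contains("GL_ARB_bindless_texture")
            && extensions.Contains("GL_ARB_shader_draw_parameters");

        // Capability alone is not enough: until the draw loop is rewritten,
        // the modern shader would be fed the legacy data layout.
        return capable && dispatcherReady ? MeshShaderPath.Modern : MeshShaderPath.Legacy;
    }
}
```

Under this reading, Task 6 corresponds to `dispatcherReady == false` (capability recorded, legacy shader kept), and Task 10 is the point where it becomes true.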
|
||
|
||
- [ ] **Step 6.5: Commit**
|
||
|
||
```
|
||
phase(N.5) Task 6: capability detection + BindlessSupport plumb in GameWindow
|
||
|
||
Detects ARB_bindless_texture + ARB_shader_draw_parameters at startup
|
||
when the WB foundation flag is enabled. Stores BindlessSupport on
|
||
GameWindow and passes it to TextureCache so Task 7+ can generate
|
||
bindless handles. Mesh shader load remains mesh_instanced for now —
|
||
Task 10 swaps to mesh_modern after the dispatcher is rewired.
|
||
|
||
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
```

---

## Task 7: Add SSBO + indirect buffer infrastructure to WbDrawDispatcher

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs`

- [ ] **Step 7.1: Create DrawElementsIndirectCommand struct**
|
||
|
||
Create `src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs`:
|
||
|
||
```csharp
|
||
using System.Runtime.InteropServices;
|
||
|
||
namespace AcDream.App.Rendering.Wb;
|
||
|
||
/// <summary>
|
||
/// Layout matches what <c>glMultiDrawElementsIndirect</c> expects.
|
||
/// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this).
|
||
/// </summary>
|
||
[StructLayout(LayoutKind.Sequential, Pack = 4)]
|
||
public struct DrawElementsIndirectCommand
|
||
{
|
||
public uint Count; // index count for this draw
|
||
public uint InstanceCount; // number of instances
|
||
public uint FirstIndex; // offset into IBO, in indices
|
||
public int BaseVertex; // vertex offset into VBO
|
||
public uint BaseInstance; // first instance ID (offsets per-instance attribs / SSBO read)
|
||
}
|
||
```
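
Because later steps hard-code the 20-byte stride, it may be worth asserting the layout once. The sketch below assumes a stand-in struct (`IndirectCommandSketch`, a hypothetical name mirroring the fields above so the example compiles on its own) and uses `Unsafe.SizeOf` and `Marshal.OffsetOf` to check size and field offsets.

```csharp
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical stand-in with the same field order as DrawElementsIndirectCommand.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct IndirectCommandSketch
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

public static class IndirectCommandLayoutCheck
{
    // Five tightly packed 4-byte fields.
    public const int ExpectedSize = 5 * sizeof(uint);

    // Size of the struct as laid out in managed memory.
    public static int ManagedSize() => Unsafe.SizeOf<IndirectCommandSketch>();

    // Byte offset of a named field within the struct.
    public static int OffsetOf(string field) =>
        (int)Marshal.OffsetOf<IndirectCommandSketch>(field);
}
```

A test could compare `ManagedSize()` against `ExpectedSize` and confirm that `BaseInstance` sits at byte offset 16.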
|
||
|
||
- [ ] **Step 7.2: Add SSBO + indirect buffer fields + BatchData struct to WbDrawDispatcher**
|
||
|
||
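In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add the following at the top of the class, replacing the existing `_instanceVbo` field. `[StructLayout]` needs `using System.Runtime.InteropServices;` at the top of the file if it is not already imported: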
|
||
|
||
```csharp
|
||
private readonly BindlessSupport _bindless;
|
||
|
||
// SSBO buffer ids
|
||
private uint _instanceSsbo;
|
||
private uint _batchSsbo;
|
||
private uint _indirectBuffer;
|
||
|
||
// Per-frame scratch arrays
|
||
private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance
|
||
private BatchData[] _batchData = new BatchData[256];
|
||
private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256];
|
||
|
||
private int _opaqueDrawCount;
|
||
private int _transparentDrawCount;
|
||
private int _transparentByteOffset;
|
||
|
||
[StructLayout(LayoutKind.Sequential, Pack = 4)]
|
||
private struct BatchData
|
||
{
|
||
public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
|
||
public uint TextureLayer;
|
||
public uint Flags;
|
||
}
|
||
```
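
`BatchData` is 16 bytes (8 + 4 + 4), and the shader side reads the handle as a `uvec2`. As a rough sketch of how the 64-bit value maps onto those two 32-bit components, assuming a little-endian host (so the low word is stored first and lands in `.x`), the helper below splits and rejoins a handle. `BindlessHandleSketch` is a hypothetical name used only for illustration.

```csharp
public static class BindlessHandleSketch
{
    // Low 32 bits correspond to uvec2.x, high 32 bits to uvec2.y
    // (assumes the ulong is written to the buffer in little-endian order).
    public static (uint X, uint Y) ToUvec2(ulong handle) =>
        ((uint)(handle & 0xFFFFFFFFul), (uint)(handle >> 32));

    // Inverse of ToUvec2: rebuild the 64-bit handle from its two halves.
    public static ulong FromUvec2(uint x, uint y) => ((ulong)y << 32) | x;
}
```

Nothing in the dispatcher needs to call this; writing the `ulong` directly into the SSBO produces the same bytes. It is mainly useful when inspecting buffer contents in a graphics debugger.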
|
||
|
||
Remove the existing `private readonly uint _instanceVbo;` field.
|
||
|
||
- [ ] **Step 7.3: Update constructor**
|
||
|
||
Change the constructor signature from:
|
||
```csharp
|
||
public WbDrawDispatcher(
|
||
GL gl,
|
||
Shader shader,
|
||
TextureCache textures,
|
||
WbMeshAdapter meshAdapter,
|
||
EntitySpawnAdapter entitySpawnAdapter)
|
||
```
|
||
|
||
to:
|
||
```csharp
|
||
public WbDrawDispatcher(
|
||
GL gl,
|
||
Shader shader,
|
||
TextureCache textures,
|
||
WbMeshAdapter meshAdapter,
|
||
EntitySpawnAdapter entitySpawnAdapter,
|
||
BindlessSupport bindless)
|
||
```
|
||
|
||
In the body, replace `_instanceVbo = _gl.GenBuffer();` with:
|
||
```csharp
_bindless = bindless ?? throw new ArgumentNullException(nameof(bindless));
_instanceSsbo = _gl.GenBuffer();
_batchSsbo = _gl.GenBuffer();
_indirectBuffer = _gl.GenBuffer();
```
|
||
|
||
- [ ] **Step 7.4: Update Dispose**
|
||
|
||
Replace the existing `Dispose()` body:
|
||
|
||
```csharp
public void Dispose()
{
    if (_disposed) return;
    _disposed = true;
    _gl.DeleteBuffer(_instanceSsbo);
    _gl.DeleteBuffer(_batchSsbo);
    _gl.DeleteBuffer(_indirectBuffer);
}
```
|
||
|
||
- [ ] **Step 7.5: Update WbDrawDispatcher construction site in GameWindow**
|
||
|
||
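Find the existing `new WbDrawDispatcher(...)` call in `GameWindow.cs` and add `_bindlessSupport!` as the final argument. The `!` only suppresses the nullable warning. After Task 6, `_bindlessSupport` stays null whenever bindless textures or `GL_ARB_shader_draw_parameters` are missing, even with the WB foundation flag on, and the constructor then throws `ArgumentNullException`. Confirm that the construction site is only reached when `_bindlessSupport` is non-null, or guard it accordingly.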
|
||
|
||
- [ ] **Step 7.6: Build + tests**
|
||
|
||
Run: `dotnet build`
|
||
Expected: PASS.
|
||
|
||
Run: `dotnet test --filter "FullyQualifiedName~Wb"`
|
||
Expected: PASS. The existing tests are CPU-only and do not exercise the changed buffer plumbing.

If `WbDrawDispatcher.Draw` still references `_instanceVbo`, the build breaks. Leave the existing `Draw()` method untouched except for substituting `_instanceVbo` → `_instanceSsbo`, so the legacy path treats the SSBO as if it were a vertex buffer. The behavior may be wrong, but it compiles, which is all Tasks 7-9 need; any visual breakage only matters once the shader is flipped in Task 10, and Tasks 9-10 fully rewrite the method.
|
||
|
||
- [ ] **Step 7.7: Commit**
|
||
|
||
```
|
||
phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure
|
||
|
||
Adds DrawElementsIndirectCommand struct (20-byte layout for
|
||
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
|
||
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
|
||
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
|
||
constructor parameter — non-null required since the dispatcher is only
|
||
constructed when WB foundation is on.
|
||
|
||
Existing Draw() method substitutes _instanceVbo → _instanceSsbo for
|
||
compile. Behavior temporarily wrong; Tasks 9-10 fully rewrite the
|
||
draw loop.
|
||
|
||
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
```

---

## Task 8: Update InstanceGroup + GroupKey for bindless handles

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`

- [ ] **Step 8.1: Update InstanceGroup**
|
||
|
||
In `WbDrawDispatcher.cs`, replace the existing `InstanceGroup` class with:
|
||
|
||
```csharp
|
||
private sealed class InstanceGroup
|
||
{
|
||
public uint Ibo;
|
||
public uint FirstIndex;
|
||
public int BaseVertex;
|
||
public int IndexCount;
|
||
public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4)
|
||
public uint TextureLayer; // 0 for per-instance composites
|
||
public TranslucencyKind Translucency;
|
||
public int FirstInstance;
|
||
public int InstanceCount;
|
||
public float SortDistance;
|
||
public readonly List<Matrix4x4> Matrices = new();
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 8.2: Update GroupKey**
|
||
|
||
Replace the `GroupKey` record:
|
||
|
||
```csharp
|
||
private readonly record struct GroupKey(
|
||
uint Ibo,
|
||
uint FirstIndex,
|
||
int BaseVertex,
|
||
int IndexCount,
|
||
ulong BindlessTextureHandle,
|
||
uint TextureLayer,
|
||
TranslucencyKind Translucency);
|
||
```
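
The reason `GroupKey` is a `readonly record struct` is that the compiler generates value equality and hashing, so two batches with identical fields land in the same dictionary bucket and are drawn as one instanced group. A minimal sketch of that behavior follows, using a trimmed-down hypothetical key (`GroupingSketch.Key`) rather than the real one.

```csharp
using System.Collections.Generic;

public static class GroupingSketch
{
    // Hypothetical, reduced key: the real GroupKey carries more fields.
    public readonly record struct Key(uint Ibo, uint FirstIndex, ulong TextureHandle, uint TextureLayer);

    // Counts how many submissions share each key, mimicking instance bucketing.
    public static Dictionary<Key, int> CountInstances(IEnumerable<Key> submissions)
    {
        var counts = new Dictionary<Key, int>();
        foreach (var key in submissions)
        {
            counts.TryGetValue(key, out int n); // n stays 0 when the key is new
            counts[key] = n + 1;
        }
        return counts;
    }
}
```

Two submissions that differ only in texture handle end up in separate groups, which is why the 64-bit handle and the layer both have to be part of the key.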
|
||
|
||
- [ ] **Step 8.3: Update ResolveTexture method**
|
||
|
||
Replace the existing `ResolveTexture` method (returns `uint`) with:
|
||
|
||
```csharp
|
||
private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
|
||
{
|
||
uint surfaceId = batch.Key.SurfaceId;
|
||
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
|
||
|
||
uint overrideOrigTex = 0;
|
||
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
|
||
&& meshRef.SurfaceOverrides.TryGetValue(surfaceId, out overrideOrigTex);
|
||
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
|
||
|
||
if (entity.PaletteOverride is not null)
|
||
{
|
||
return _textures.GetOrUploadWithPaletteOverrideBindless(
|
||
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
|
||
}
|
||
else if (hasOrigTexOverride)
|
||
{
|
||
return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex);
|
||
}
|
||
else
|
||
{
|
||
return _textures.GetOrUploadBindless(surfaceId);
|
||
}
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 8.4: Update ClassifyBatches to use the new return type**
|
||
|
||
Replace the existing `ClassifyBatches` to use `ulong texHandle` and pass the layer:
|
||
|
||
```csharp
|
||
private void ClassifyBatches(
|
||
ObjectRenderData renderData,
|
||
ulong gfxObjId,
|
||
Matrix4x4 model,
|
||
WorldEntity entity,
|
||
MeshRef meshRef,
|
||
ulong palHash,
|
||
AcSurfaceMetadataTable metaTable)
|
||
{
|
||
for (int batchIdx = 0; batchIdx < renderData.Batches.Count; batchIdx++)
|
||
{
|
||
var batch = renderData.Batches[batchIdx];
|
||
|
||
TranslucencyKind translucency;
|
||
if (metaTable.TryLookup(gfxObjId, batchIdx, out var meta))
|
||
{
|
||
translucency = meta.Translucency;
|
||
}
|
||
else
|
||
{
|
||
translucency = batch.IsAdditive ? TranslucencyKind.Additive
|
||
: batch.IsTransparent ? TranslucencyKind.AlphaBlend
|
||
: TranslucencyKind.Opaque;
|
||
}
|
||
|
||
ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash);
|
||
if (texHandle == 0) continue;
|
||
|
||
// For per-instance composites we use 1-layer Texture2DArray, layer always 0.
|
||
// When N.6 adopts WB's atlas, this becomes batch's layer index.
|
||
uint texLayer = 0;
|
||
|
||
var key = new GroupKey(
|
||
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
|
||
batch.IndexCount, texHandle, texLayer, translucency);
|
||
|
||
if (!_groups.TryGetValue(key, out var grp))
|
||
{
|
||
grp = new InstanceGroup
|
||
{
|
||
Ibo = batch.IBO,
|
||
FirstIndex = batch.FirstIndex,
|
||
BaseVertex = (int)batch.BaseVertex,
|
||
IndexCount = batch.IndexCount,
|
||
BindlessTextureHandle = texHandle,
|
||
TextureLayer = texLayer,
|
||
Translucency = translucency,
|
||
};
|
||
_groups[key] = grp;
|
||
}
|
||
grp.Matrices.Add(model);
|
||
}
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 8.5: Update remaining DrawGroup/EnsureInstanceAttribs references**
|
||
|
||
`DrawGroup` and `EnsureInstanceAttribs` are deleted in Task 10. To keep the build green during Task 8, do not comment them out. Instead, replace the `DrawGroup` body with `throw new NotImplementedException("Task 9-10 rewrites this");` so its call sites compile but throw at runtime, and leave `EnsureInstanceAttribs` in place as long as it still compiles. The visual is broken until Task 10; that is expected.
|
||
|
||
Update the `Draw()` method's per-group loop to compile:
|
||
```csharp
foreach (var grp in _opaqueDraws)
{
    _shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
    DrawGroup(grp); // throws until Task 10 rewrites the loop
}
```
|
||
|
||
(The user does NOT visually verify at this task. Build green only.)
|
||
|
||
- [ ] **Step 8.6: Build**
|
||
|
||
Run: `dotnet build`
|
||
Expected: PASS.
|
||
|
||
Run: `dotnet test --filter "FullyQualifiedName~Wb"`
|
||
Expected: existing tests PASS (they're CPU-only — they don't actually invoke `DrawGroup`).
|
||
|
||
- [ ] **Step 8.7: Commit**
|
||
|
||
```
|
||
phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer
|
||
|
||
Replaces uint TextureHandle (32-bit GL name) with ulong
|
||
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
|
||
return type. Adds TextureLayer (always 0 for per-instance composites,
|
||
becomes meaningful when WB atlas is adopted in N.6).
|
||
|
||
ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants.
|
||
DrawGroup body throws NotImplementedException — Task 9-10 rewrites
|
||
the draw loop.
|
||
|
||
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
```

---

## Task 9: Build BatchData + DEIC arrays per frame (TDD)

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`

This task adds a pure CPU method `BuildIndirectArrays()` that the dispatcher will call before issuing draws. Unit-testable without GL context.

- [ ] **Step 9.1: Write the failing test**
|
||
|
||
Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherIndirectBuilderTests.cs`:
|
||
|
||
```csharp
|
||
using System.Numerics;
|
||
using AcDream.App.Rendering.Wb;
|
||
using AcDream.Core.Meshing;
|
||
using Xunit;
|
||
|
||
namespace AcDream.Core.Tests.Rendering.Wb;
|
||
|
||
/// <summary>
|
||
/// Pure CPU test of <see cref="WbDrawDispatcher.BuildIndirectArrays"/>.
|
||
/// Builds a synthetic group set and verifies the laid-out indirect commands
|
||
/// match the spec §5 walk-through.
|
||
/// </summary>
|
||
public sealed class WbDrawDispatcherIndirectBuilderTests
|
||
{
|
||
[Fact]
|
||
public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst()
|
||
{
|
||
// Arrange — synthetic groups laid out as in spec §5
|
||
var groups = new List<WbDrawDispatcher.IndirectGroupInput>
|
||
{
|
||
new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
|
||
new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend),
|
||
new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
|
||
};
|
||
|
||
var indirect = new DrawElementsIndirectCommand[16];
|
||
var batch = new WbDrawDispatcher.BatchDataPublic[16];
|
||
|
||
// Act
|
||
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
|
||
|
||
// Assert layout
|
||
Assert.Equal(2, result.OpaqueCount);
|
||
Assert.Equal(1, result.TransparentCount);
|
||
Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20
|
||
|
||
// Opaque section, sorted as input order (Task 11 adds sort)
|
||
Assert.Equal(100u, indirect[0].Count);
|
||
Assert.Equal(0u, indirect[0].FirstIndex);
|
||
Assert.Equal(0, indirect[0].BaseVertex);
|
||
Assert.Equal(12u, indirect[0].InstanceCount);
|
||
Assert.Equal(0u, indirect[0].BaseInstance);
|
||
|
||
Assert.Equal(50u, indirect[1].Count);
|
||
Assert.Equal(300u, indirect[1].FirstIndex);
|
||
Assert.Equal(100, indirect[1].BaseVertex);
|
||
Assert.Equal(1u, indirect[1].InstanceCount);
|
||
Assert.Equal(24u, indirect[1].BaseInstance);
|
||
|
||
// Transparent section
|
||
Assert.Equal(200u, indirect[2].Count);
|
||
Assert.Equal(100u, indirect[2].FirstIndex);
|
||
Assert.Equal(12u, indirect[2].InstanceCount);
|
||
Assert.Equal(12u, indirect[2].BaseInstance);
|
||
|
||
// BatchData parallel
|
||
Assert.Equal(0xAAul, batch[0].TextureHandle);
|
||
Assert.Equal(0xCCul, batch[1].TextureHandle);
|
||
Assert.Equal(0xBBul, batch[2].TextureHandle);
|
||
}
|
||
|
||
[Fact]
|
||
public void EmptyGroupList_ProducesZeroCounts()
|
||
{
|
||
var groups = new List<WbDrawDispatcher.IndirectGroupInput>();
|
||
var indirect = new DrawElementsIndirectCommand[0];
|
||
var batch = new WbDrawDispatcher.BatchDataPublic[0];
|
||
|
||
var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
|
||
|
||
Assert.Equal(0, result.OpaqueCount);
|
||
Assert.Equal(0, result.TransparentCount);
|
||
Assert.Equal(0, result.TransparentByteOffset);
|
||
}
|
||
}
|
||
```
|
||
|
||
- [ ] **Step 9.2: Run, verify it fails**
|
||
|
||
Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"`
|
||
Expected: COMPILE FAIL — `BuildIndirectArrays` and supporting public types don't exist.
|
||
|
||
- [ ] **Step 9.3: Implement BuildIndirectArrays + supporting types**
|
||
|
||
In `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`, add public helper types + static method (above the private `InstanceGroup` class):
|
||
|
||
```csharp
|
||
/// <summary>Public view of the per-group inputs to <see cref="BuildIndirectArrays"/> — used in tests.</summary>
|
||
public readonly record struct IndirectGroupInput(
|
||
int IndexCount,
|
||
uint FirstIndex,
|
||
int BaseVertex,
|
||
int InstanceCount,
|
||
int FirstInstance,
|
||
ulong TextureHandle,
|
||
uint TextureLayer,
|
||
TranslucencyKind Translucency);
|
||
|
||
/// <summary>Public mirror of the per-group BatchData laid into the SSBO. Tests verify alignment.</summary>
|
||
[StructLayout(LayoutKind.Sequential, Pack = 4)]
|
||
public struct BatchDataPublic
|
||
{
|
||
public ulong TextureHandle;
|
||
public uint TextureLayer;
|
||
public uint Flags;
|
||
}
|
||
|
||
public readonly record struct IndirectLayoutResult(
|
||
int OpaqueCount,
|
||
int TransparentCount,
|
||
int TransparentByteOffset);
|
||
|
||
/// <summary>
|
||
/// Lays out the indirect commands + parallel BatchData array contiguously:
|
||
/// opaque section first, transparent section second. Pure CPU, no GL state.
|
||
/// Caller passes scratch arrays (pre-sized).
|
||
/// </summary>
|
||
public static IndirectLayoutResult BuildIndirectArrays(
|
||
IReadOnlyList<IndirectGroupInput> groups,
|
||
DrawElementsIndirectCommand[] indirectScratch,
|
||
BatchDataPublic[] batchScratch)
|
||
{
|
||
int opaqueCount = 0;
|
||
int transparentCount = 0;
|
||
|
||
// First pass: count
|
||
foreach (var g in groups)
|
||
{
|
||
if (IsOpaque(g.Translucency)) opaqueCount++;
|
||
else transparentCount++;
|
||
}
|
||
|
||
// Second pass: lay out — opaque [0..opaqueCount), transparent [opaqueCount..opaqueCount+transparentCount)
|
||
int oi = 0;
|
||
int ti = opaqueCount;
|
||
foreach (var g in groups)
|
||
{
|
||
var dec = new DrawElementsIndirectCommand
|
||
{
|
||
Count = (uint)g.IndexCount,
|
||
InstanceCount = (uint)g.InstanceCount,
|
||
FirstIndex = g.FirstIndex,
|
||
BaseVertex = g.BaseVertex,
|
||
BaseInstance = (uint)g.FirstInstance,
|
||
};
|
||
var bd = new BatchDataPublic
|
||
{
|
||
TextureHandle = g.TextureHandle,
|
||
TextureLayer = g.TextureLayer,
|
||
Flags = 0,
|
||
};
|
||
|
||
if (IsOpaque(g.Translucency))
|
||
{
|
||
indirectScratch[oi] = dec;
|
||
batchScratch[oi] = bd;
|
||
oi++;
|
||
}
|
||
else
|
||
{
|
||
indirectScratch[ti] = dec;
|
||
batchScratch[ti] = bd;
|
||
ti++;
|
||
}
|
||
}
|
||
|
||
int sizeofDEIC = 20; // matches struct layout
|
||
return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * sizeofDEIC);
|
||
}
|
||
|
||
private static bool IsOpaque(TranslucencyKind t)
|
||
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
|
||
```
|
||
|
||
- [ ] **Step 9.4: Run test, verify pass**
|
||
|
||
Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherIndirectBuilder"`
|
||
Expected: PASS (2 tests).
|
||
|
||
Run full filter: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
|
||
Expected: 60+ existing tests + 2 new = PASS.
|
||
|
||
- [ ] **Step 9.5: Commit**
|
||
|
||
```
|
||
phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch
|
||
|
||
Pure CPU helper that lays out a group list into a contiguous indirect
|
||
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
|
||
opaque section first, transparent section second. Returns counts +
|
||
byte offset for the transparent section.
|
||
|
||
Tests cover the spec §5 walk-through layout: per-group fields propagate
|
||
correctly, opaque/transparent partition lands at the expected indices.
|
||
|
||
Static + public so tests can exercise without a GL context. Tasks
|
||
10-11 wire it into Draw().
|
||
|
||
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
```

---

## Task 10: Replace draw loop with glMultiDrawElementsIndirect (visual verification)

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Modify: `src/AcDream.App/Rendering/GameWindow.cs`

This is the load-bearing task. After this lands, visual verification is required.

- [ ] **Step 10.1: Rewrite WbDrawDispatcher.Draw**
|
||
|
||
Replace the entire `Draw()` method body in `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`. Phases 1-3 (entity walk, group bucketing, matrix layout) stay as they are; the later phases (4-8 in the listing below) are rewritten:
|
||
|
||
```csharp
|
||
public unsafe void Draw(
|
||
ICamera camera,
|
||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
||
FrustumPlanes? frustum = null,
|
||
uint? neverCullLandblockId = null,
|
||
HashSet<uint>? visibleCellIds = null,
|
||
HashSet<uint>? animatedEntityIds = null)
|
||
{
|
||
_shader.Use();
|
||
var vp = camera.View * camera.Projection;
|
||
_shader.SetMatrix4("uViewProjection", vp);
|
||
|
||
// Lighting uniforms — match what mesh_modern.frag declares (Task 5.3).
|
||
// Read the existing N.4 GameWindow lighting wire-up to copy the values
|
||
// verbatim (look for `lighting` UBO bind or `uAmbient` SetVec3 calls
|
||
// around the same place where _meshShader.Use() / SetMatrix4 happens).
|
||
// If N.4 used a UBO: change mesh_modern.frag in Task 5.3 to match the UBO,
|
||
// then bind the UBO here via `_gl.BindBufferBase(UniformBuffer, 1, lightingUbo)`.
|
||
// If N.4 used uniforms: replicate the same SetVec3 calls here.
|
||
|
||
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
|
||
|
||
Vector3 camPos = Vector3.Zero;
|
||
if (Matrix4x4.Invert(camera.View, out var invView))
|
||
camPos = invView.Translation;
|
||
|
||
// ── Phases 1-2: walk entities, build groups, lay matrices ───────────
|
||
foreach (var grp in _groups.Values) grp.Matrices.Clear();
|
||
var metaTable = _meshAdapter.MetadataTable;
|
||
uint anyVao = 0;
|
||
|
||
foreach (var entry in landblockEntries)
|
||
{
|
||
bool landblockVisible = frustum is null
|
||
|| entry.LandblockId == neverCullLandblockId
|
||
|| FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax);
|
||
if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
|
||
continue;
|
||
|
||
foreach (var entity in entry.Entities)
|
||
{
|
||
if (entity.MeshRefs.Count == 0) continue;
|
||
|
||
bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true;
|
||
if (!landblockVisible && !isAnimated) continue;
|
||
if (entity.ParentCellId.HasValue && visibleCellIds is not null
|
||
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
|
||
continue;
|
||
|
||
if (frustum is not null && !isAnimated && entry.LandblockId != neverCullLandblockId)
|
||
{
|
||
var p = entity.Position;
|
||
var aMin = new Vector3(p.X - PerEntityCullRadius, p.Y - PerEntityCullRadius, p.Z - PerEntityCullRadius);
|
||
var aMax = new Vector3(p.X + PerEntityCullRadius, p.Y + PerEntityCullRadius, p.Z + PerEntityCullRadius);
|
||
if (!FrustumCuller.IsAabbVisible(frustum.Value, aMin, aMax))
|
||
continue;
|
||
}
|
||
|
||
if (diag) _entitiesSeen++;
|
||
|
||
var entityWorld =
|
||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
||
Matrix4x4.CreateTranslation(entity.Position);
|
||
|
||
ulong palHash = 0;
|
||
if (entity.PaletteOverride is not null)
|
||
palHash = TextureCache.HashPaletteOverride(entity.PaletteOverride);
|
||
|
||
bool drewAny = false;
|
||
for (int partIdx = 0; partIdx < entity.MeshRefs.Count; partIdx++)
|
||
{
|
||
var meshRef = entity.MeshRefs[partIdx];
|
||
ulong gfxObjId = meshRef.GfxObjId;
|
||
var renderData = _meshAdapter.TryGetRenderData(gfxObjId);
|
||
if (renderData is null) { if (diag) _meshesMissing++; continue; }
|
||
drewAny = true;
|
||
if (anyVao == 0) anyVao = renderData.VAO;
|
||
|
||
if (renderData.IsSetup && renderData.SetupParts.Count > 0)
|
||
{
|
||
foreach (var (partGfxObjId, partTransform) in renderData.SetupParts)
|
||
{
|
||
var partData = _meshAdapter.TryGetRenderData(partGfxObjId);
|
||
if (partData is null) continue;
|
||
var model = ComposePartWorldMatrix(entityWorld, meshRef.PartTransform, partTransform);
|
||
ClassifyBatches(partData, partGfxObjId, model, entity, meshRef, palHash, metaTable);
|
||
}
|
||
}
|
||
else
|
||
{
|
||
var model = meshRef.PartTransform * entityWorld;
|
||
ClassifyBatches(renderData, gfxObjId, model, entity, meshRef, palHash, metaTable);
|
||
}
|
||
}
|
||
|
||
if (diag && drewAny) _entitiesDrawn++;
|
||
}
|
||
}
|
||
|
||
if (anyVao == 0) { if (diag) MaybeFlushDiag(); return; }
|
||
|
||
int totalInstances = 0;
|
||
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
|
||
if (totalInstances == 0) { if (diag) MaybeFlushDiag(); return; }
|
||
|
||
// ── Phase 3: assign FirstInstance per group, lay matrices contiguous ─
|
||
int needed = totalInstances * 16;
|
||
if (_instanceData.Length < needed)
|
||
_instanceData = new float[needed + 256 * 16];
|
||
|
||
_opaqueDraws.Clear();
|
||
_translucentDraws.Clear();
|
||
int cursor = 0;
|
||
foreach (var grp in _groups.Values)
|
||
{
|
||
if (grp.Matrices.Count == 0) continue;
|
||
grp.FirstInstance = cursor;
|
||
grp.InstanceCount = grp.Matrices.Count;
|
||
var first = grp.Matrices[0];
|
||
var grpPos = new Vector3(first.M41, first.M42, first.M43);
|
||
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
|
||
|
||
for (int i = 0; i < grp.Matrices.Count; i++)
|
||
{
|
||
WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]);
|
||
cursor++;
|
||
}
|
||
|
||
if (IsOpaqueGroup(grp.Translucency))
|
||
_opaqueDraws.Add(grp);
|
||
else
|
||
_translucentDraws.Add(grp);
|
||
}
|
||
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
|
||
|
||
// ── Phase 4: build BatchData + DEIC arrays ──────────────────────────
|
||
int totalDraws = _opaqueDraws.Count + _translucentDraws.Count;
|
||
if (_batchData.Length < totalDraws)
|
||
_batchData = new BatchData[totalDraws + 64];
|
||
if (_indirectCommands.Length < totalDraws)
|
||
_indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64];
|
||
|
||
var groupInputs = new List<IndirectGroupInput>(totalDraws);
|
||
foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g));
|
||
foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g));
|
||
|
||
// BuildIndirectArrays writes BatchDataPublic, so give it its own scratch
// array and copy the results into _batchData afterwards. Casting _batchData
// and calling ToArray() would hand the builder a detached copy, and its
// writes would never reach _batchData.
var batchScratch = new BatchDataPublic[totalDraws];
var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchScratch);
for (int i = 0; i < totalDraws; i++)
{
    _batchData[i] = new BatchData
    {
        TextureHandle = batchScratch[i].TextureHandle,
        TextureLayer = batchScratch[i].TextureLayer,
        Flags = batchScratch[i].Flags,
    };
}
|
||
_opaqueDrawCount = layout.OpaqueCount;
|
||
_transparentDrawCount = layout.TransparentCount;
|
||
_transparentByteOffset = layout.TransparentByteOffset;
|
||
|
||
// ── Phase 5: upload three buffers ───────────────────────────────────
|
||
fixed (float* ip = _instanceData)
|
||
UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float));
|
||
fixed (BatchData* bp = _batchData)
|
||
UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData));
|
||
fixed (DrawElementsIndirectCommand* cp = _indirectCommands)
|
||
{
|
||
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
|
||
_gl.BufferData(BufferTargetARB.DrawIndirectBuffer,
|
||
(nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw);
|
||
}
|
||
|
||
// ── Phase 6: bind global VAO once ───────────────────────────────────
|
||
_gl.BindVertexArray(anyVao);
|
||
|
||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||
_gl.Disable(EnableCap.CullFace);
|
||
|
||
    // ── Phase 7: opaque pass ───────────────────────────────────────────
    if (_opaqueDrawCount > 0)
    {
        _gl.Disable(EnableCap.Blend);
        _gl.DepthMask(true);
        _shader.SetInt("uRenderPass", 0);
        _gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
        _gl.MultiDrawElementsIndirect(
            PrimitiveType.Triangles,
            DrawElementsType.UnsignedShort,
            indirect: (void*)0,
            drawcount: (uint)_opaqueDrawCount,
            stride: (uint)sizeof(DrawElementsIndirectCommand));
    }

    // ── Phase 8: transparent pass ──────────────────────────────────────
    if (_transparentDrawCount > 0)
    {
        _gl.Enable(EnableCap.Blend);
        _gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
        _gl.DepthMask(false);
        _shader.SetInt("uRenderPass", 1);
        _gl.MultiDrawElementsIndirect(
            PrimitiveType.Triangles,
            DrawElementsType.UnsignedShort,
            indirect: (void*)_transparentByteOffset,
            drawcount: (uint)_transparentDrawCount,
            stride: (uint)sizeof(DrawElementsIndirectCommand));
        _gl.DepthMask(true);
        _gl.Disable(EnableCap.Blend);
    }

    _gl.Disable(EnableCap.CullFace);
    _gl.BindVertexArray(0);

    if (diag)
    {
        _drawsIssued += _opaqueDrawCount + _transparentDrawCount;
        _instancesIssued += totalInstances;
        MaybeFlushDiag();
    }
}

private static bool IsOpaqueGroup(TranslucencyKind t)
    => t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;

private static IndirectGroupInput ToInput(InstanceGroup g) => new(
    IndexCount: g.IndexCount,
    FirstIndex: g.FirstIndex,
    BaseVertex: g.BaseVertex,
    InstanceCount: g.InstanceCount,
    FirstInstance: g.FirstInstance,
    TextureHandle: g.BindlessTextureHandle,
    TextureLayer: g.TextureLayer,
    Translucency: g.Translucency);

private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount)
{
    _gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo);
    _gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw);
    _gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo);
}
```

Delete the old `DrawGroup`, `EnsureInstanceAttribs`, and `ResolveTexture` (the old uint-returning version) methods — they're no longer called.

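For orientation, the transparent pass's `indirect` argument is a byte offset into the same command buffer the opaque pass reads from offset 0. The sketch below is an illustration only, not the plan's `BuildIndirectArrays`: `IndirectCommandSketch` and `IndirectLayoutSketch` are hypothetical names, and the struct simply mirrors the five 32-bit fields OpenGL defines for `DrawElementsIndirectCommand`. It assumes the opaque commands are stored first and tightly packed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mirror of GL's DrawElementsIndirectCommand: five packed 32-bit fields.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct IndirectCommandSketch
{
    public uint Count;          // number of indices to draw
    public uint InstanceCount;  // number of instances
    public uint FirstIndex;     // offset into the index buffer, in indices
    public int BaseVertex;      // value added to every fetched index
    public uint BaseInstance;   // first instance ID for this sub-draw
}

public static class IndirectLayoutSketch
{
    // Assumes opaque commands occupy the front of the buffer, so the
    // transparent section starts right after them.
    public static int TransparentByteOffset(int opaqueCount)
        => opaqueCount * Marshal.SizeOf<IndirectCommandSketch>();
}
```

With this layout each command is 20 bytes, so three opaque commands put the transparent section at byte 60.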
- [ ] **Step 10.2: Switch GameWindow shader load to mesh_modern**

Find the Task 6 block in `GameWindow.cs` and change the shader load from `mesh_instanced` to `mesh_modern` when `_bindlessSupport != null`:

```csharp
if (_bindlessSupport is not null)
{
    _meshShader = new Shader(_gl,
        Path.Combine(shadersDir, "mesh_modern.vert"),
        Path.Combine(shadersDir, "mesh_modern.frag"));
    Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
    _meshShader = new Shader(_gl,
        Path.Combine(shadersDir, "mesh_instanced.vert"),
        Path.Combine(shadersDir, "mesh_instanced.frag"));
}
```

- [ ] **Step 10.3: Build + run all tests**

Run: `dotnet build`
Expected: PASS.

Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ tests + 2 new BuildIndirectArrays tests PASS.

- [ ] **Step 10.4: Visual smoke test (USER GATE)**

Launch:
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_WB_DIAG = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch-task10.log
```

Expected:
- Console shows `[N.5] mesh_modern shader loaded`.
- Holtburg renders with characters + scenery + buildings visible.
- `[WB-DIAG]` shows draws dropping from N.4's hundreds to ~3-5 per frame for entity rendering.

User confirms visual identity. If broken, debug — most likely failure modes:
1. Shader compile failure → console log will show GLSL info log; fix vert/frag.
2. Black textures everywhere → bindless handle generation broken; check `_bindless` is non-null in TextureCache.
3. Wrong geometry → BaseVertex / FirstIndex misaligned; verify against N.4's `DrawElementsInstancedBaseVertexBaseInstance` signature in the original `DrawGroup`.
4. Wrong matrices on entities → InstanceSsbo upload size wrong; verify `totalInstances * 16 * sizeof(float)`.

- [ ] **Step 10.5: Commit only after visual verification passes**

```
phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified

Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three buffers: two SSBOs (instance matrices @ binding=0,
batch data @ binding=1) plus the indirect command buffer.

Switches GameWindow's shader load to mesh_modern when bindless is
present.

Visual verification: Holtburg courtyard renders identical to N.4.
Entity draw calls drop from "few hundred per pass" to 1 per pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 11: Update ClassifyBatches for translucency restructure (TDD)

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`
- Create: `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`

Per Decision 2: `Additive` and `InvAlpha` merge into transparent (alpha-blend). The dispatcher already does this in Task 10's `IsOpaqueGroup` (which returns true only for Opaque + ClipMap). This task ADDS a unit test and tightens the contract.

- [ ] **Step 11.1: Write the failing test**

Create `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherTranslucencyTests.cs`:

```csharp
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;

namespace AcDream.Core.Tests.Rendering.Wb;

/// <summary>
/// Locks in the N.5 translucency partition contract (Decision 2):
/// Opaque + ClipMap → opaque indirect; AlphaBlend + Additive + InvAlpha → transparent.
/// </summary>
public sealed class WbDrawDispatcherTranslucencyTests
{
    [Theory]
    [InlineData(TranslucencyKind.Opaque, true)]
    [InlineData(TranslucencyKind.ClipMap, true)]
    [InlineData(TranslucencyKind.AlphaBlend, false)]
    [InlineData(TranslucencyKind.Additive, false)]
    [InlineData(TranslucencyKind.InvAlpha, false)]
    public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected)
    {
        Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind));
    }
}
```

- [ ] **Step 11.2: Add IsOpaquePublic to WbDrawDispatcher**

Make `IsOpaqueGroup` public (or add a `public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);` shim):

```csharp
public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaqueGroup(t);
```

- [ ] **Step 11.3: Run test, verify PASS**

Run: `dotnet test --filter "FullyQualifiedName~WbDrawDispatcherTranslucency"`
Expected: 5 tests PASS.

Run all: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: 60+ + 2 + 5 = 67+ PASS.

- [ ] **Step 11.4: Commit**

```
phase(N.5) Task 11: lock in translucency partition contract

Adds WbDrawDispatcherTranslucencyTests verifying that the N.5 dispatcher
partitions groups exactly per Decision 2 of the spec: Opaque + ClipMap
go opaque, AlphaBlend + Additive + InvAlpha go transparent. Catches
future refactors that drift the partition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 12: Add CPU stopwatch + GL timer query timing in [WB-DIAG]

**Files:**
- Modify: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`

- [ ] **Step 12.1: Add timing fields**

In `WbDrawDispatcher.cs`, add to the diagnostic-counter block:

```csharp
// CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1
private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new();
private readonly long[] _cpuSamples = new long[256]; // microseconds
private int _cpuSampleCursor;
private uint _gpuQueryOpaque;
private uint _gpuQueryTransparent;
private readonly long[] _gpuSamples = new long[256]; // microseconds
private int _gpuSampleCursor;
private bool _gpuQueriesInitialized;
```

- [ ] **Step 12.2: Initialize GPU queries lazily in Draw()**

Near the top of `Draw()`, immediately after `bool diag = ...` is declared (the block below reads `diag`, so it cannot come before it), add:

```csharp
if (diag && !_gpuQueriesInitialized)
{
    _gpuQueryOpaque = _gl.GenQuery();
    _gpuQueryTransparent = _gl.GenQuery();
    _gpuQueriesInitialized = true;
}
```

- [ ] **Step 12.3: Wrap the draw passes with timing**

Do not gate the stopwatch on `diag`: call `_cpuStopwatch.Restart();` unconditionally at the top of the method (always on, cheap) and only LOG under diag.

At the very top of `Draw()` (just inside the method):

```csharp
_cpuStopwatch.Restart();
```

Wrap the opaque pass `MultiDrawElementsIndirect` call:

```csharp
if (diag) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque);
_gl.MultiDrawElementsIndirect(...); // existing call
if (diag) _gl.EndQuery(QueryTarget.TimeElapsed);
```

Same for transparent pass with `_gpuQueryTransparent`.

At the bottom of `Draw()` (after `_gl.BindVertexArray(0)`):

```csharp
_cpuStopwatch.Stop();
if (diag)
{
    long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency;
    _cpuSamples[_cpuSampleCursor] = cpuUs;
    _cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length;

    // GPU sample read: non-blocking, so a frame is skipped when the driver has
    // not finished its queries yet (likely often, since this polls right after
    // EndQuery). Only touch queries that were begun this frame; a query that
    // was never begun has no result to read.
    bool opaqueRan = _opaqueDrawCount > 0;
    bool transRan = _transparentDrawCount > 0;
    int opaqueReady = 1, transReady = 1;
    if (opaqueRan) _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResultAvailable, out opaqueReady);
    if (transRan) _gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResultAvailable, out transReady);
    if ((opaqueRan || transRan) && opaqueReady != 0 && transReady != 0)
    {
        long opaqueNs = 0, transNs = 0;
        if (opaqueRan) _gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.QueryResult, out opaqueNs);
        if (transRan) _gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.QueryResult, out transNs);
        long gpuUs = (opaqueNs + transNs) / 1000;
        _gpuSamples[_gpuSampleCursor] = gpuUs;
        _gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length;
    }
}
```

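The sample bookkeeping in this step (tick-to-microsecond conversion plus a fixed-size ring that overwrites its oldest slot) can be exercised without a GL context. The following is a minimal sketch under that assumption; `MicrosRingSketch` is a hypothetical helper, not a type in the codebase, and it takes the timer frequency as a parameter so the arithmetic is deterministic.

```csharp
using System;

public sealed class MicrosRingSketch
{
    private readonly long[] _samples;
    private int _cursor;

    public MicrosRingSketch(int capacity) => _samples = new long[capacity];

    // Same integer arithmetic as the Stopwatch conversion above: ticks -> microseconds.
    public static long TicksToMicros(long ticks, long frequency)
        => ticks * 1_000_000L / frequency;

    // Writes at the cursor and wraps, so the oldest sample is overwritten first.
    public void Add(long micros)
    {
        _samples[_cursor] = micros;
        _cursor = (_cursor + 1) % _samples.Length;
    }

    // Copy of the current contents; unwritten slots are still 0.
    public long[] Snapshot() => (long[])_samples.Clone();
}
```

Integer division truncates, so sub-microsecond timings record as 0, which the percentile helpers treat as an empty slot.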
- [ ] **Step 12.4: Update MaybeFlushDiag to log timing percentiles**

Replace the existing `MaybeFlushDiag` body:

```csharp
private void MaybeFlushDiag()
{
    long now = Environment.TickCount64;
    if (now - _lastLogTick > 5000)
    {
        long cpuMed = MedianMicros(_cpuSamples);
        long cpuP95 = Percentile95Micros(_cpuSamples);
        long gpuMed = MedianMicros(_gpuSamples);
        long gpuP95 = Percentile95Micros(_gpuSamples);
        Console.WriteLine(
            $"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " +
            $"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95");
        _entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
        _lastLogTick = now;
    }
}

private static long MedianMicros(long[] samples)
{
    var copy = (long[])samples.Clone();
    Array.Sort(copy);
    // A zero slot means "not written yet". After the ascending sort, the nz
    // real samples occupy the last nz slots of the array.
    int nz = 0;
    foreach (var v in copy) if (v > 0) { nz++; }
    if (nz == 0) return 0;
    // Middle of the nonzero tail (stays in bounds even when nz == 1).
    return copy[copy.Length - nz + nz / 2];
}

private static long Percentile95Micros(long[] samples)
{
    var copy = (long[])samples.Clone();
    Array.Sort(copy);
    int nz = 0;
    foreach (var v in copy) if (v > 0) { nz++; }
    if (nz == 0) return 0;
    // Step back 5% of the real samples from the largest one.
    int idx = copy.Length - 1 - (int)(nz * 0.05);
    return copy[idx];
}
```

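The two helpers above share one idea: sort, skip the zero (unwritten) slots, then index into the nonzero tail. As a cross-check, here is a minimal generalized sketch using the nearest-rank percentile definition. `RingStatsSketch` is a hypothetical name and the nearest-rank choice is an assumption, so its results can differ by one slot from `Percentile95Micros`. It also assumes samples are never negative.

```csharp
using System;

public static class RingStatsSketch
{
    // Nearest-rank percentile over the nonzero entries only; p is in [0, 1].
    // Returns 0 when no real sample has been recorded yet.
    public static long Percentile(long[] samples, double p)
    {
        var copy = (long[])samples.Clone();
        Array.Sort(copy);

        // After an ascending sort, zeros (empty slots) come first.
        int firstReal = Array.FindIndex(copy, v => v > 0);
        if (firstReal < 0) return 0;

        int n = copy.Length - firstReal;
        int rank = Math.Clamp((int)Math.Ceiling(p * n) - 1, 0, n - 1);
        return copy[firstReal + rank];
    }
}
```

For example, `RingStatsSketch.Percentile(new long[] { 0, 0, 30, 10, 20 }, 0.5)` returns 20, and a buffer holding a single real sample returns that sample for any `p`.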
- [ ] **Step 12.5: Update Dispose**

Add to `Dispose()`:

```csharp
if (_gpuQueriesInitialized)
{
    _gl.DeleteQuery(_gpuQueryOpaque);
    _gl.DeleteQuery(_gpuQueryTransparent);
}
```

- [ ] **Step 12.6: Build + smoke test**

Run: `dotnet build`
Expected: PASS.

Smoke launch with `ACDREAM_WB_DIAG=1`. Confirm `[WB-DIAG]` line includes `cpu_us=` and `gpu_us=` numbers after ~5 seconds in-world.

- [ ] **Step 12.7: Commit**

```
phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG]

Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent), polled non-blocking at the end of Draw().

Numbers populate the SHIP commit message (Task 19).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 13: Capture before/after perf numbers (USER GATE)

**Files:**
- (none — measurement task)

- [ ] **Step 13.1: Capture N.5 numbers in Holtburg courtyard**

Launch acdream with `ACDREAM_WB_DIAG=1`. Position character at Holtburg courtyard, 30m elevated, looking SW. Stand still for ~30 seconds. Read the `[WB-DIAG]` line. Record:

```
N.5 Holtburg courtyard:
  cpu_us=Xmedian/Yp95
  gpu_us=Zmedian/Wp95
  drawsIssued=K
  groups=G
```

- [ ] **Step 13.2: Capture N.5 numbers in Foundry interior**

Move to Foundry interior, default heading. Same 30s. Record same metrics.

- [ ] **Step 13.3: Compare against N.4 baseline**

Stash N.5 changes:
```bash
git stash
git checkout c445364  # N.4 SHIP
dotnet build
```

Repeat measurements with N.4 active. Record numbers in the same format. Compare:

| Scene | N.4 cpu med | N.5 cpu med | Δ% | N.4 gpu med | N.5 gpu med | Δ% | N.4 draws | N.5 draws |
|---|---|---|---|---|---|---|---|---|
| Holtburg courtyard | | | | | | | | |
| Foundry interior | | | | | | | | |

Restore N.5:
```bash
git checkout claude/priceless-feistel-c12935
git stash pop
```

- [ ] **Step 13.4: Verify acceptance gates**

Acceptance per spec §8.3:
- [ ] CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction).
- [ ] GPU rendering time within ±10% of N.4 (sanity).
- [ ] `drawsIssued ≤ 5 per pass`.

If gates fail: investigate. Common causes:
- Per-frame `glBufferData` is the bottleneck → defer to N.6 persistent-mapping (per Decision 7).
- SSBO indexing slower than expected on driver → check NVIDIA / AMD / Intel separately.
- Group bucketing not sharing groups well → `groups` count dominates `drawsIssued`.

Save the table to a file: `docs/plans/2026-05-08-phase-n5-perf-baseline.md`. This goes in the SHIP commit.

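The two numeric gates above reduce to simple arithmetic on the recorded medians. A minimal sketch, assuming the inputs are the N.4 and N.5 median microsecond values from the comparison table; `PerfGateSketch` and its method names are hypothetical and exist only to make the thresholds concrete:

```csharp
using System;

public static class PerfGateSketch
{
    // CPU gate: N.5 must take at most 70% of the N.4 time (a reduction of 30% or more).
    public static bool CpuGatePasses(double n4CpuUs, double n5CpuUs)
        => n5CpuUs <= 0.70 * n4CpuUs;

    // GPU gate: N.5 must land within ±10% of the N.4 time.
    public static bool GpuGatePasses(double n4GpuUs, double n5GpuUs)
        => Math.Abs(n5GpuUs - n4GpuUs) <= 0.10 * n4GpuUs;

    // Value for the Δ% columns: positive means N.5 is faster.
    public static double ReductionPercent(double before, double after)
        => (before - after) * 100.0 / before;
}
```

For instance, a CPU median of 1000 µs under N.4 and 650 µs under N.5 is a 35% reduction and passes; 750 µs would be 25% and fail.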
- [ ] **Step 13.5: Commit perf baseline**

```bash
git add docs/plans/2026-05-08-phase-n5-perf-baseline.md
git commit -m "phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry

[heredoc body]"
```

Heredoc body:
```
phase(N.5) Task 13: perf baseline — N.4 vs N.5 in Holtburg + Foundry

Captures CPU + GPU + draw-count numbers for the SHIP gate.

Acceptance gates:
- CPU dispatcher time ≤ 70% of N.4: [PASS / FAIL]
- GPU rendering time within ±10% of N.4: [PASS / FAIL]
- drawsIssued ≤ 5 per pass: [PASS / FAIL]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 14: Visual verification at Holtburg + Foundry + magic content (USER GATE)

**Files:**
- (none — verification task; only commits if regressions found)

- [ ] **Step 14.1: Holtburg courtyard visual identity**

Launch acdream, position at Holtburg courtyard. Compare side-by-side against N.4 (use git stash + checkout flow from Task 13 if needed). Confirm:
- All scenery (trees, fences, rocks, buildings) renders correctly.
- No missing entities.
- No z-fighting introduced.
- No exploded character parts.

- [ ] **Step 14.2: Foundry interior visual identity**

Move to Foundry. Confirm same checklist. Pay attention to dense static-object scenes.

- [ ] **Step 14.3: Indoor → outdoor transition**

Walk through portal/door from outdoors to indoors and back. Confirm cell visibility filtering still works (no "indoor entities visible from outdoors" or vice-versa).

- [ ] **Step 14.4: Drudge / character close-up**

Find a drudge or NPC. Walk close. Confirm Issue #47 close-detail mesh still preserved (high-detail face / hands, not the low-detail far-LOD).

- [ ] **Step 14.5: Magic content (additive fallback check per Q2)**

Move through magic-themed content: any glowing weapon decals, runes on walls, magical aura textures. Compare against N.4. If anything appears "darker" or "less luminous" → that's the Decision 2 additive regression.

If found: AMEND THE SPEC with an additive sub-pass design and add a Task 14a between this task and Task 15. Do NOT proceed to ship without resolving.

- [ ] **Step 14.6: Long-session sanity check (USER GATE)**

Run an hour-long session with `ACDREAM_WB_DIAG=1` and watch the resident bindless handle count. `[WB-DIAG]` does not log this yet: add a `bindlessHandlesCount` field to the diag line (small task), or, if that is not done, monitor process VRAM via Task Manager or similar. Expected: a bounded plateau under 5K handles.

If unbounded growth: file an N.6 follow-up issue, don't block the ship.

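One way the `bindlessHandlesCount` figure could be derived is by summing the sizes of the three bindless cache dictionaries that Task 3 introduces. The sketch below is an assumption-heavy illustration: the key types, the `(uint, ulong)` value tuple, and the `BindlessCacheCountSketch` class are all hypothetical stand-ins, and the real `TextureCache` fields may differ.

```csharp
using System.Collections.Generic;

public sealed class BindlessCacheCountSketch
{
    // Hypothetical stand-ins for _bindlessBySurfaceId / _bindlessByOverridden /
    // _bindlessByPalette. Each value pairs a GL texture name with its resident handle.
    public Dictionary<uint, (uint GlName, ulong Handle)> BySurfaceId { get; } = new();
    public Dictionary<(uint, uint), (uint GlName, ulong Handle)> ByOverridden { get; } = new();
    public Dictionary<(uint, ulong), (uint GlName, ulong Handle)> ByPalette { get; } = new();

    // One resident handle per cache entry, so the total is the sum of the counts.
    public int ResidentHandleCount
        => BySurfaceId.Count + ByOverridden.Count + ByPalette.Count;
}
```

Logging this value in the 5-second rollup would make the plateau (or its absence) visible directly in the `[WB-DIAG]` output.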
- [ ] **Step 14.7: Document findings**

Append to `docs/plans/2026-05-08-phase-n5-perf-baseline.md`:

```markdown
## Visual verification (Task 14)

- Holtburg courtyard: PASS / FAIL (note specific issues)
- Foundry interior: PASS / FAIL
- Cell transitions: PASS / FAIL
- Character close-up (Issue #47): PASS / FAIL
- Magic content (additive check): PASS / FAIL
- Long-session sanity: PASS / FAIL — peak resident handles ~N
```

- [ ] **Step 14.8: Commit findings (no code change)**

```
phase(N.5) Task 14: visual verification — all gates pass

[Or if any failed: amend with sub-task to address.]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 15: Delete legacy mesh_instanced shader files

**Files:**
- Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.vert`
- Delete: `src/AcDream.App/Rendering/Shaders/mesh_instanced.frag`
- Modify: `src/AcDream.App/Rendering/GameWindow.cs` (remove fallback path)

This task removes the fallback shader path. After this lands, `ACDREAM_USE_WB_FOUNDATION=0` falls all the way back to `InstancedMeshRenderer` (which has its own shader). The intermediate "WB foundation on but bindless missing" state no longer exists — if bindless is missing, we treat it as foundation-off.

- [ ] **Step 15.1: Delete shader files**

```bash
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.vert
git rm src/AcDream.App/Rendering/Shaders/mesh_instanced.frag
```

- [ ] **Step 15.2: Update GameWindow shader load**

Replace the conditional shader load block in `GameWindow.cs` with the single modern path:

```csharp
if (_bindlessSupport is not null)
{
    _meshShader = new Shader(_gl,
        Path.Combine(shadersDir, "mesh_modern.vert"),
        Path.Combine(shadersDir, "mesh_modern.frag"));
    Console.WriteLine("[N.5] mesh_modern shader loaded");
}
else
{
    // Bindless missing — log and skip WbDrawDispatcher construction so
    // InstancedMeshRenderer handles all rendering (same effect as
    // ACDREAM_USE_WB_FOUNDATION=0).
    Console.WriteLine("[N.5] bindless extension missing — falling back to InstancedMeshRenderer");
    // _meshShader stays unloaded; InstancedMeshRenderer owns its own shader path.
    // The `_dispatcher = new WbDrawDispatcher(...)` site below must be wrapped:
    //   _dispatcher = (_bindlessSupport is not null) ? new WbDrawDispatcher(...) : null;
    // and the per-frame draw call must guard `_dispatcher?.Draw(...)`.
}
```

Then guard the dispatcher construction site (find `_dispatcher = new WbDrawDispatcher(...)` in the same file):

```csharp
_dispatcher = (_bindlessSupport is not null)
    ? new WbDrawDispatcher(_gl, _meshShader, _textureCache, _meshAdapter, _entitySpawnAdapter, _bindlessSupport)
    : null;
```

And the per-frame call site:

```csharp
_dispatcher?.Draw(camera, landblockEntries, frustum, ...);
```

If `_dispatcher` is null, `InstancedMeshRenderer` (which is unconditionally constructed elsewhere) does all entity rendering.

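The nullable-dispatcher arrangement described above can be modelled without any GL types. This is a minimal sketch under one stated assumption: exactly one of the two renderers handles entities in a given frame. All names in it (`IEntityRendererSketch`, `ModernDispatcherSketch`, `LegacyRendererSketch`, `FrameSketch`) are hypothetical placeholders, not the real `WbDrawDispatcher` or `InstancedMeshRenderer`.

```csharp
#nullable enable
using System;

public interface IEntityRendererSketch
{
    string Draw();
}

public sealed class ModernDispatcherSketch : IEntityRendererSketch
{
    public string Draw() => "modern";
}

public sealed class LegacyRendererSketch : IEntityRendererSketch
{
    public string Draw() => "legacy";
}

public sealed class FrameSketch
{
    // Null when the bindless capability check failed at startup.
    private readonly ModernDispatcherSketch? _dispatcher;

    // Always constructed, mirroring the unconditional legacy renderer.
    private readonly LegacyRendererSketch _legacy = new();

    public FrameSketch(bool bindlessSupported)
        => _dispatcher = bindlessSupported ? new ModernDispatcherSketch() : null;

    // The modern path wins when it exists; otherwise the legacy renderer draws.
    public string RenderEntities() => _dispatcher?.Draw() ?? _legacy.Draw();
}
```

Deciding once at construction keeps the per-frame cost to a single null check.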
- [ ] **Step 15.3: Build + tests**

Run: `dotnet build`
Expected: PASS.

Run: `dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition"`
Expected: PASS.

- [ ] **Step 15.4: Smoke test (legacy fallback path)**

Test the legacy fallback by running with foundation off:
```powershell
$env:ACDREAM_USE_WB_FOUNDATION = "0"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug
```

Confirm InstancedMeshRenderer renders correctly (this exercises the escape hatch the SHIP commit message claims still works).

- [ ] **Step 15.5: Commit**

```
phase(N.5) Task 15: delete legacy mesh_instanced shader files

mesh_instanced.vert + .frag deleted. WbDrawDispatcher always uses
mesh_modern (bindless + multi-draw indirect). Legacy escape hatch
runs via InstancedMeshRenderer + ACDREAM_USE_WB_FOUNDATION=0 — its
own shader path, untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 16: Update CLAUDE.md WB integration cribs

**Files:**
- Modify: `CLAUDE.md`

- [ ] **Step 16.1: Read existing WB integration cribs section**

Read `CLAUDE.md` lines 28-80 (the "WB integration cribs" section).

- [ ] **Step 16.2: Add N.5 patterns**

Append to the WB integration cribs section after the existing bullets:

```markdown
- **N.5 modern dispatch** uses bindless textures + multi-draw indirect.
  `WbDrawDispatcher.Draw` uploads three buffers per frame: two SSBOs,
  `_instanceSsbo` (mat4 per instance) and `_batchSsbo` (texture handle +
  layer + flags per group), plus `_indirectBuffer`
  (`DrawElementsIndirectCommand[]`, bound as the draw-indirect buffer). Two
  `glMultiDrawElementsIndirect` calls per frame — opaque, transparent.
  See `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.
- **`TextureCache` requires `BindlessSupport`** for the WB modern path.
  Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles
  made resident at upload time. Old `uint`-returning methods stay for
  Sky / Terrain / Debug renderers.
- **Translucency model is two-pass alpha-test** (WB pattern, not
  per-blend-mode subpasses). Opaque pass discards `α<0.95`, transparent
  pass discards `α≥0.95`. Native `Additive` blend renders as alpha-blend
  on GfxObj surfaces — falsifiable; if a regression shows up on magic
  content, add a third indirect call with `glBlendFunc(SrcAlpha, One)`.
- **Per-instance highlight (selection blink) is reserved.** `InstanceData`
  has a documented hook for `vec4 highlightColor` — Phase B.4 follow-up
  adds the field + plumbs server-side selection state. Stride grows from
  64 → 80 bytes when added; shader updates trivially.
```

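For reference while reviewing the translucency crib, the two-pass discard rule can be written as a tiny CPU-side model. This is an illustration only and is not text to append to `CLAUDE.md`: `AlphaPassSketch` is a hypothetical name, the real rule lives in the fragment shader, and the sketch assumes `uRenderPass` is 0 for the opaque pass and 1 for the transparent pass, as set in Task 10.

```csharp
public static class AlphaPassSketch
{
    public const float Threshold = 0.95f;

    // renderPass 0 = opaque pass (discards alpha below the threshold),
    // renderPass 1 = transparent pass (discards alpha at or above it).
    public static bool IsDiscarded(int renderPass, float alpha)
        => renderPass == 0 ? alpha < Threshold : alpha >= Threshold;
}
```

The useful property is that the two conditions are complementary, so every fragment is kept by exactly one of the two passes.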
- [ ] **Step 16.3: Build (sanity check; this task changes markdown only)**

Run: `dotnet build`
Expected: PASS.

- [ ] **Step 16.4: Commit**

```
phase(N.5) Task 16: extend CLAUDE.md WB cribs with N.5 patterns

Adds four new bullets covering the modern dispatch's three-buffer layout,
TextureCache.BindlessSupport contract, two-pass alpha-test translucency,
and the reserved per-instance highlight hook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 17: Update memory + roadmap

**Files:**
- Create: `memory/project_phase_n5_state.md` (under user's `~/.claude/projects/.../memory/`)
- Modify: `MEMORY.md` (under user's `~/.claude/projects/.../memory/`)
- Modify: `docs/plans/2026-04-11-roadmap.md`

Memory files live under `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\` per the `auto memory` system prompt section.

- [ ] **Step 17.1: Create memory entry for N.5 state**

Create `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\project_phase_n5_state.md`:

```markdown
---
name: Project: Phase N.5 state (shipped 2026-05-XX)
description: N.5 lifted WbDrawDispatcher onto bindless + multi-draw indirect. CPU dispatcher time dropped to ~30-40% of N.4. Three new gotchas captured.
type: project
---
**Phase N.5 — Modern Rendering Path — shipped 2026-05-XX.**

WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect.
Per-frame: 3 buffer uploads (2 SSBOs + the indirect command buffer) + 2
indirect calls (opaque + transparent). All textures are 1-layer
Texture2DArray; sampler2DArray in shader.

Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
Spec at `docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`.

**Why:** N.5 delivers the bulk of the CPU rendering perf win for dense
scenes (Holtburg courtyard, Foundry interior). N.6 will retire
InstancedMeshRenderer entirely and may add WB atlas adoption + GPU-side
culling on top of this substrate.

**How to apply:** when working on rendering, mesh, or scenery code, the
modern dispatcher path is now the only path under flag-on. Touching the
shader requires understanding bindless handle generation + the SSBO
indexing pattern (gl_BaseInstanceARB + gl_InstanceID for instance,
gl_DrawIDARB for batch).

## Three gotchas surfaced during N.5 implementation

[FILL IN AT SHIP TIME — common candidates:]
1. SSBO upload size off-by-one if you forget instance-stride alignment.
2. `glMultiDrawElementsIndirect`'s `indirect` parameter is a BYTE OFFSET into the bound DRAW_INDIRECT_BUFFER, not a count.
3. Bindless handle 0 is the error sentinel (`glGetTextureHandleARB` returns 0 on failure), never a usable handle; guard for it before populating BatchData.
```

- [ ] **Step 17.2: Add MEMORY.md index entry**

Edit `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\MEMORY.md`. Add immediately after the existing N.4 line:

```markdown
- [Project: Phase N.5 state](project_phase_n5_state.md) — **N.5 SHIPPED 2026-05-XX.** WbDrawDispatcher on bindless + multi-draw indirect. CPU dispatcher ~30-40% of N.4. Three driver-touching gotchas captured.
```

- [ ] **Step 17.3: Update roadmap**

Edit `docs/plans/2026-04-11-roadmap.md`. Move N.5 from "Currently in flight" to the "Shipped" table. Add N.6 as the new "in flight" or "next" entry per the user's preferred sequencing.

- [ ] **Step 17.4: Commit memory + roadmap**

```bash
git add docs/plans/2026-04-11-roadmap.md
git commit -m "phase(N.5): roadmap — N.5 shipped, N.6 next

[heredoc body]"
```

(Memory files live outside the repository, under `~/.claude/...`, so they are not part of this commit.)

Heredoc body:
```
phase(N.5): roadmap — N.5 shipped, N.6 next

Moves N.5 from in-flight to Shipped. Records the perf wins from
Task 13's measurement table. N.6 (retire InstancedMeshRenderer +
optional WB atlas adoption) is now the in-flight phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```

---

## Task 18: Plan finalization — append SHIP section

**Files:**
- Modify: `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md` (this file)

- [ ] **Step 18.1: Add SHIP section at the end of this plan**

Append to this plan file (`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`):

```markdown
---

## SHIP record

**Shipped: 2026-05-XX** at commit [SHIP commit SHA].

**Acceptance gates:**
- [✓] Visual identity to N.4 — confirmed at Holtburg courtyard, Foundry interior, indoor↔outdoor transitions, drudge close-up, magic content.
- [✓] CPU dispatcher time ≤ 70% of N.4 — measured: N.4=Xµs / N.5=Yµs (Z% reduction).
- [✓] GPU rendering time within ±10% of N.4 — measured: N.4=Aµs / N.5=Bµs.
- [✓] `drawsIssued ≤ 5 per pass` — measured: N opaque + M transparent per frame.
- [✓] All tests green — 60+ N.4 tests + 7 new N.5 tests.
- [✓] `ACDREAM_USE_WB_FOUNDATION=0` still works — InstancedMeshRenderer fallback verified.

**Adjustments captured during execution:** [list any spec amendments — e.g., additive sub-pass added if Task 14.5 found regressions].

**Out-of-scope follow-ups (per spec §10):**
- N.6: retire `InstancedMeshRenderer`.
- N.6 candidate: persistent-mapped buffers if `glBufferData` shows up in profiling.
- N.6 candidate: WB atlas adoption for memory savings on shared content.
- Phase B.4 follow-up: per-instance `highlightColor` for selection blink.
- (Long-session memory pressure — log evidence in N.6 watchlist.)
```

- [ ] **Step 18.2: Commit**

```bash
git add docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
git commit -m "phase(N.5): plan finalization — SHIP record appended

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```

---

## Task 19: SHIP commit
|
||
|
||
**Files:**
|
||
- (no code change — single empty commit OR amend the perf baseline commit's message)
|
||
|
||
- [ ] **Step 19.1: Verify clean tree + green build/test**
|
||
|
||
```bash
|
||
git status
|
||
dotnet build
|
||
dotnet test --filter "FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless"
|
||
```
|
||
|
||
Expected: clean tree, build PASS, all tests PASS.
|
||
|
||
- [ ] **Step 19.2: Create SHIP commit**
|
||
|
||
```bash
|
||
git commit --allow-empty -m "phase(N.5): SHIP — modern rendering path on N.4 dispatcher
|
||
|
||
[heredoc body]"
|
||
```
|
||
|
||
Heredoc body:
|
||
```
phase(N.5): SHIP — modern rendering path on N.4 dispatcher

Bindless textures + glMultiDrawElementsIndirect. Per-frame: 3 buffer
uploads (instance SSBO, batch SSBO, indirect commands), 2 indirect calls
(opaque + transparent), 1 VAO bind. Total ~15 GL calls per frame for
entity rendering (was: a few hundred per pass under N.4).

Acceptance gates (from spec §8.3):
- Visual identity to N.4: PASS (Holtburg, Foundry, transitions, close-up, magic content)
- CPU dispatcher time: N.4=[Xµs] → N.5=[Yµs] ([Z]% reduction; gate ≥30%)
- GPU rendering time: within ±10% of N.4 — PASS
- drawsIssued ≤ 5 per pass: PASS
- All tests green: PASS (67+ tests)
- Legacy fallback (ACDREAM_USE_WB_FOUNDATION=0): PASS

Plan archived at docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
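
The CPU gate in the message above compares two measured dispatcher times, so the `[Z]` figure works out to `(N4 - N5) / N4 * 100`, and the gate passes when that value is at least 30. A small sketch of the arithmetic follows. The function names are hypothetical, and the numbers in the usage note are made-up inputs, not measurements from this project.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for filling in the [Z] placeholder.

# Prints the percentage reduction from $1 (N.4 time) to $2 (N.5 time),
# rounded to one decimal place. Both times must use the same unit.
cpu_reduction_pct() {
  awk -v n4="$1" -v n5="$2" 'BEGIN { printf "%.1f\n", (n4 - n5) / n4 * 100 }'
}

# Succeeds when the reduction is at least $3 percent (default 30, the gate
# quoted in the SHIP message).
meets_cpu_gate() {
  awk -v n4="$1" -v n5="$2" -v gate="${3:-30}" \
    'BEGIN { exit !((n4 - n5) / n4 * 100 >= gate) }'
}
```

With illustrative inputs, `cpu_reduction_pct 200 120` prints `40.0`, and `meets_cpu_gate 200 150` fails because a 25% reduction is below the gate.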

- [ ] **Step 19.3: Confirm commit**

```bash
git log --oneline -5
```

Expected: top commit is "phase(N.5): SHIP — ...".

---

## Self-review checklist

After all tasks complete, verify against the spec:

- [ ] **Spec §2 Decision 1** (sampler2DArray): TextureCache uploads as Texture2DArray (Task 2). Shader samples via `sampler2DArray` (Task 5). ✓
- [ ] **Spec §2 Decision 2** (two-pass alpha-test): Shader uses `uRenderPass` discard (Task 5). Dispatcher runs two passes (Task 10). Translucency partition test (Task 11). ✓
- [ ] **Spec §2 Decision 3** (SSBO): `_instanceSsbo` + `_batchSsbo` at bindings 0+1 (Tasks 7+10). Shader reads via `gl_BaseInstanceARB` + `gl_DrawIDARB` (Task 5). ✓
- [ ] **Spec §2 Decision 4** (resident on upload): `MakeResidentHandle` (Task 3) + Dispose order (Task 4). ✓
- [ ] **Spec §2 Decision 5** (two-way flag): Capability check + fallback in GameWindow (Tasks 6+15). ✓
- [ ] **Spec §2 Decision 6** (CPU stopwatch + GL queries): Task 12. Numbers in SHIP message (Task 19). ✓
- [ ] **Spec §2 Decision 7** (defer persistent-mapped): No persistent-mapped code in this plan. ✓
- [ ] **Spec §2 Decision 8** (defer highlight): InstanceData comment reserves field (Task 5). ✓

- [ ] **Spec §4.1 TextureCache changes**: Tasks 2-4. ✓
- [ ] **Spec §4.2 WbDrawDispatcher changes**: Tasks 7-10. ✓
- [ ] **Spec §4.3 New shader files**: Task 5. ✓
- [ ] **Spec §6 Translucency detail**: Tasks 10-11. ✓
- [ ] **Spec §7 Error handling**: Task 6 (capability + compile fallback) + Task 4 (disposal order). ✓
- [ ] **Spec §8 Testing**: Task 9 (indirect builder), Task 11 (translucency), Task 13 (perf), Task 14 (visual). ✓
- [ ] **Spec §9 Risks**: Capability check + fallback paths in Tasks 6+15. ✓

No placeholders. No "implement later" tasks. Every step has either code or an exact command.

---

*End of plan.*