Detailed briefing for the next agent picking up Phase N.5 (Modern Rendering Path: bindless textures + glMultiDrawElementsIndirect on N.4's foundation).

Covers:

- Where N.4 left things (commits, what works, gotchas inherited)
- The two-feature pairing (why bindless + indirect together)
- Files to read first (WB shaders, our dispatcher, CLAUDE.md cribs)
- 8 brainstorm questions to resolve before spec
- Spec + plan structure (matching N.4's pattern)
- Acceptance criteria
- Things to explicitly NOT do

Sized for a fresh session to pick up cold without spelunking through months of session history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# Phase N.5: Modern Rendering Path (Cold-Start Handoff)

**Created:** 2026-05-08, immediately after N.4 ship.
**Audience:** the next agent picking up rendering perf work.
**Purpose:** give you everything you need to start N.5 cold, without
spelunking through five months of session history.

---

## TL;DR

N.4 just shipped: WB's `ObjectMeshManager` is now acdream's production
mesh pipeline, and `WbDrawDispatcher` is the production draw path. It
works (Holtburg renders correctly, and FPS is substantially better than
in the naïve dual-pipeline state we hit during week 4 verification), but
it still makes per-group state changes (`glBindTexture`, `glBindBuffer`
for the IBO, and one `glDrawElementsInstancedBaseVertexBaseInstance` per
group) plus a fresh `glBufferData` upload per frame.

**N.5's job: lift the dispatcher onto WB's modern rendering primitives,
which we're already paying GPU-feature-detection cost for.** Two big
wins, paired:

1. **Bindless textures** (`GL_ARB_bindless_texture`): WB already
   populates `ObjectRenderBatch.BindlessTextureHandle`. Switch our
   shader to read texture handles from a per-instance attribute
   (`uvec2` → `sampler2D` via the bindless extension). This eliminates
   every `glBindTexture` call from the draw loop.
2. **Multi-draw indirect** (`glMultiDrawElementsIndirect`): build a
   buffer of `DrawElementsIndirectCommand` structs (one per group),
   upload it once, and fire ONE `glMultiDrawElementsIndirect` call per
   pass. The driver pulls everything from the indirect buffer.

Together they target a 2-5× CPU win on draw-heavy scenes (Holtburg
courtyard, Foundry, dense dungeons). They're packaged together because
both are "modern path" extensions we already gate on, both require
the same shader rewrite, and they pair naturally: without bindless,
multi-draw indirect yields no CPU win, because a per-group
`glBindTexture` would still force one draw call per group.

**Estimated scope: 2-3 weeks.** The plan and spec come out of the
brainstorm and spec steps below.

---

## Where N.4 left things

### Branch state

If you are reading this handoff on `main` after the N.4 worktree was
merged, the N.4 commits sit at the head of `main`. The relevant final commits:

- `c445364`: N.4 SHIP (flag default-on, plan final, roadmap, memory)
- `573526d`: perf pass 1-4 (drop dead lookup, sort, cull, hash memo)
- `7b41efc`: FirstIndex/BaseVertex + Issue #47 + grouped instanced
- `943652d`: load triggers + `batch.Key.SurfaceId` source
- `01cff41`: Tasks 22+23 (`WbDrawDispatcher` + side-table)

If the worktree branch (`claude/tender-mcclintock-a16839`) hasn't been
merged yet, that's where the work is. Verify with `git log --oneline`.

### What works in N.4

- `ACDREAM_USE_WB_FOUNDATION=1` is default-on. WB's `ObjectMeshManager`
  loads, decodes, and uploads every entity mesh. Our existing
  `TextureCache` decodes textures (palette-aware, with per-instance
  overrides via `GetOrUploadWithPaletteOverride`).
- `WbDrawDispatcher.Draw`:
  - Walks visible entities (per-landblock AABB cull + per-entity AABB
    cull + portal visibility)
  - Buckets every (entity × meshRef × batch) tuple by
    `GroupKey(Ibo, FirstIndex, BaseVertex, IndexCount, TextureHandle, Translucency)`
  - Uploads all matrices for the frame in a single `glBufferData` call
  - Per group: `glActiveTexture(GL_TEXTURE0) + glBindTexture(GL_TEXTURE_2D, handle) + glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo) + glDrawElementsInstancedBaseVertexBaseInstance(..., FirstInstance)`
  - Two passes: opaque (front-to-back sorted) + translucent
- 940/948 tests pass (8 pre-existing failures unrelated to rendering).
- Visual verification at Holtburg passed: scenery + characters render
  correctly with full close-detail geometry (Issue #47 preserved).

### What N.5 inherits

These are the levers N.5 will pull:

- **WB's modern rendering is already active.** `OpenGLGraphicsDevice`
  detected GL 4.3 + bindless on first run; WB's `_useModernRendering`
  is true; every mesh lives in WB's single `GlobalMeshBuffer` (one VAO,
  one VBO, one IBO).
- **Bindless handles are already populated.** `ObjectRenderBatch.BindlessTextureHandle`
  is non-zero for batches whose texture WB owns. (See "Per-instance
  entity texture handles" below for entities with palette overrides:
  those use acdream's `TextureCache`, which doesn't expose bindless
  handles yet.)
- **The instance VBO is acdream-owned** (`WbDrawDispatcher._instanceVbo`)
  with locations 3-6 patched onto WB's global VAO. The stride is 64 bytes
  (one mat4). N.5 expands this to mat4 + uvec2 handle = 72 bytes,
  padded to 80.

### Three load-bearing WB API gotchas N.4 surfaced

These bit us hard during Task 26 visual verification. They are
documented in CLAUDE.md "WB integration cribs", plan adjustments 7-9,
and `memory/project_phase_n4_state.md`, and are restated here because
they reshape the design space:

1. **`ObjectMeshManager.IncrementRefCount(id)` is NOT lifecycle-aware.**
   It only bumps a usage counter. Mesh loading is triggered separately
   via `PrepareMeshDataAsync(id, isSetup)`. The result auto-enqueues
   to `_stagedMeshData` (line 510 of `ObjectMeshManager.cs`); our
   existing `WbMeshAdapter.Tick()` drains it. `WbMeshAdapter.IncrementRefCount`
   already calls `PrepareMeshDataAsync`. **N.5 doesn't need to change
   this; just don't break it.**

2. **`ObjectRenderBatch.SurfaceId` is unset.** WB constructs batches
   with `Key = batch.Key` (a `TextureAtlasManager.TextureKey` struct
   that has a `SurfaceId` field) but never populates the top-level
   `SurfaceId` property. Read `batch.Key.SurfaceId` instead. **N.5
   keeps this pattern.**

3. **WB's modern rendering packs every mesh into ONE global
   VAO/VBO/IBO.** Each batch's `IBO` field points to the global IBO;
   the batch's actual slice is identified by `FirstIndex` (offset into
   the IBO, in *indices*) and `BaseVertex` (offset into the VBO, in
   *vertices*). N.4's draw uses `glDrawElementsInstancedBaseVertexBaseInstance`
   with those offsets. **N.5's per-group `DrawElementsIndirectCommand`
   record will carry `firstIndex` + `baseVertex` for the same reason.**

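
The units in gotcha 3 matter when moving between the two draw styles: the direct `glDrawElements*` calls take their index offset as a *byte* offset into the bound IBO, whereas the indirect record's `firstIndex` is counted in indices. A minimal sketch of the conversion, assuming 16-bit or 32-bit indices; `index_byte_offset` is a hypothetical helper, not part of WB or acdream:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a FirstIndex value (counted in indices) into the byte offset
   that the direct glDrawElements* family expects for its `indices`
   argument. Hypothetical helper for illustration only. */
static size_t index_byte_offset(uint32_t first_index, size_t index_size)
{
    /* GL index types used here are 2 bytes (GL_UNSIGNED_SHORT)
       or 4 bytes (GL_UNSIGNED_INT). */
    assert(index_size == sizeof(uint16_t) || index_size == sizeof(uint32_t));
    return (size_t)first_index * index_size;
}
```

With the indirect path no conversion is needed: `FirstIndex` is copied into the command record unchanged.
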

---

## What N.5 is: technical detail

### The two-feature pairing

**Bindless textures** (`GL_ARB_bindless_texture`):

- Each texture handle is a 64-bit integer (carried as a `uvec2` in GLSL).
- The shader either declares `layout(bindless_sampler) uniform sampler2D ...`
  or receives the handle as a `uvec2` vertex attribute (per-instance,
  in our case).
- No `glBindTexture` is needed at draw time; the handle IS the binding.
- Handle generation: `glGetTextureHandleARB(textureId)` followed by
  `glMakeTextureHandleResidentARB(handle)`. The handle must be resident
  before any shader samples through it; accessing a non-resident handle
  is undefined behavior and can fault the GPU.

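
On the CPU side, the 64-bit handle has to be split into two 32-bit words before it can travel as a `uvec2`. A minimal sketch, assuming the low word goes in `.x` and the high word in `.y` (which is also what you get by writing the 64-bit value straight into the buffer on a little-endian host). `UVec2`, `pack_handle`, and `unpack_handle` are illustrative names:

```c
#include <assert.h>
#include <stdint.h>

/* CPU-side mirror of a GLSL uvec2. */
typedef struct {
    uint32_t x; /* low 32 bits of the handle (assumed convention) */
    uint32_t y; /* high 32 bits of the handle */
} UVec2;

/* Splits a 64-bit bindless handle into two 32-bit words. */
static UVec2 pack_handle(uint64_t handle)
{
    UVec2 v;
    v.x = (uint32_t)(handle & 0xFFFFFFFFu);
    v.y = (uint32_t)(handle >> 32);
    return v;
}

/* Inverse of pack_handle; handy for round-trip checks in tests. */
static uint64_t unpack_handle(UVec2 v)
{
    return ((uint64_t)v.y << 32) | (uint64_t)v.x;
}
```

A round-trip check like `unpack_handle(pack_handle(h)) == h` is cheap to keep in a unit test, since a swapped word order tends to show up only as garbage textures at draw time.
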
**Multi-draw indirect** (`glMultiDrawElementsIndirect`):

- Indirect command struct layout (must match `DrawElementsIndirectCommand`):
  ```c
  struct {
      uint count;         // index count for this draw
      uint instanceCount; // number of instances
      uint firstIndex;    // offset into IBO, in indices
      int  baseVertex;    // vertex offset into VBO
      uint baseInstance;  // first instance ID (offsets per-instance attribs)
  };
  ```
- Build a buffer of N of these structs (one per group), upload it once,
  and fire one GL call: `glMultiDrawElementsIndirect(mode, indexType, ptr, drawcount, stride)`.
- The driver issues all N draws in one shot, so there is effectively
  zero CPU overhead per draw beyond uploading the indirect buffer.

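
The buffer-building step above can be sketched in plain C. This assumes a hypothetical `DrawGroup` record holding what N.4's `GroupKey` plus its instance list already know, and assumes instances are laid out contiguously in group order so that `baseInstance` is a running sum; it is an illustration, not the dispatcher's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the GL-defined DrawElementsIndirectCommand layout:
   five tightly packed 32-bit fields. */
typedef struct {
    uint32_t count;
    uint32_t instance_count;
    uint32_t first_index;
    int32_t  base_vertex;
    uint32_t base_instance;
} DrawElementsIndirectCommand;

_Static_assert(sizeof(DrawElementsIndirectCommand) == 20,
               "indirect command must be 20 bytes, tightly packed");

/* Hypothetical per-group record (not a WB or acdream type). */
typedef struct {
    uint32_t index_count;
    uint32_t first_index;    /* in indices */
    int32_t  base_vertex;    /* in vertices */
    uint32_t instance_count;
} DrawGroup;

/* Fills out[0..n) with one command per group and returns the total
   number of instances, i.e. how many instance records to upload. */
static uint32_t build_indirect_commands(const DrawGroup *groups, size_t n,
                                        DrawElementsIndirectCommand *out)
{
    uint32_t next_instance = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i].count          = groups[i].index_count;
        out[i].instance_count = groups[i].instance_count;
        out[i].first_index    = groups[i].first_index;
        out[i].base_vertex    = groups[i].base_vertex;
        out[i].base_instance  = next_instance; /* start of this group's instances */
        next_instance += groups[i].instance_count;
    }
    return next_instance;
}
```

The resulting array is what gets uploaded to the `GL_DRAW_INDIRECT_BUFFER`; `drawcount` is `n`, and a `stride` of 0 or 20 both describe this tightly packed layout.
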
**Why pair them.** Multi-draw indirect gives you no opportunity to
change uniforms or texture bindings between the draws inside one call.
If textures are still bound via `glBindTexture` per group, the batch has
to be split back into one draw call per group, which defeats the
purpose. Bindless removes that constraint by encoding the texture handle
as per-instance data the shader reads directly. With both, the modern
render loop becomes:

```
1. Upload instance buffer (mat4 + uvec2 handle, per-instance)   (once per frame)
2. Upload indirect command buffer (one DEIC per group)          (once per frame)
3. glBindVertexArray(globalVAO)                                 (once)
4. glMultiDrawElementsIndirect(...)                             (ONCE per pass)
```

That's it. No per-group state changes.

### Instance attribute layout

Currently (N.4): locations 3-6 = mat4 model matrix (16 floats = 64 bytes).

N.5 (proposed): locations 3-6 = mat4, plus location 7 = uvec2 bindless
handle. That is 16 floats + 2 uints = 72 bytes, padded to 80 bytes
(16-byte aligned) per WB's `InstanceData` precedent.

Or spell the padding out in a 16-byte-aligned struct (std140-style):

```c
struct InstanceData {
    mat4  transform;      // locations 3-6
    uvec2 textureHandle;  // location 7
    uvec2 _pad;           // padding to 80
};
```

The brainstorm should decide whether we copy WB's `InstanceData` struct
(Pack=16, 80 bytes including CellId/Flags fields we don't use) or define
our own minimal version. Either way the stride is 80 bytes, matching
WB's, so global VAO state configured by WB stays compatible if the
legacy WB draw path ever runs.

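
To make the byte arithmetic concrete, here is a CPU-side mirror of the minimal variant with its size and offsets checked at compile time. This is a sketch of the "define our own" option only; it is not WB's `InstanceData` (which carries extra fields), and the helper name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal acdream-style instance record (assumed layout, not WB's). */
typedef struct {
    float    transform[16];     /* mat4, attribute locations 3-6 */
    uint32_t texture_handle[2]; /* uvec2, attribute location 7 */
    uint32_t pad[2];            /* pads 72 bytes up to the 80-byte stride */
} InstanceData;

_Static_assert(sizeof(InstanceData) == 80, "stride must be 80 bytes");
_Static_assert(offsetof(InstanceData, texture_handle) == 64,
               "handle must start right after the mat4");

/* Byte offset of mat4 column `col` (0-3) inside one record; this is the
   offset that attribute location 3 + col would be pointed at. */
static size_t transform_column_offset(unsigned col)
{
    return offsetof(InstanceData, transform) + (size_t)col * 4u * sizeof(float);
}
```

The four column offsets come out as 0, 16, 32, and 48, with the handle at 64, so the attribute pointer setup in the spec can be written down directly from this struct.
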
### Per-instance entity texture handles

Here's the wrinkle. N.4 uses `WbDrawDispatcher.ResolveTexture` to map
each (entity, batch) to a GL texture name:

- Tree (no overrides): `_textures.GetOrUpload(surfaceId)` → 2D texture name
- NPC with palette override: `_textures.GetOrUploadWithPaletteOverride(...)` → composite-cached 2D texture name
- Anything with surface override: `_textures.GetOrUploadWithOrigTextureOverride(...)` → composite-cached 2D texture name

Those are all 32-bit `GLuint` texture *names*, not bindless handles.
**N.5 needs bindless handles for everything `TextureCache` owns, not
just for WB-owned textures.**

Implementation sketch:

- `TextureCache` adds a parallel cache, keyed identically but storing
  64-bit bindless handles. On first request, generate the handle via
  `glGetTextureHandleARB(textureId)` and make it resident.
- New API: `GetBindlessHandle(uint surfaceId, ...)` returns the handle.
- Or: change every `GetOrUpload*` method to return both the GL name
  and the bindless handle (or just the handle; the GL name can be added
  back if anyone needs it later).

WB's `ObjectRenderBatch.BindlessTextureHandle` covers the atlas-tier
case. For per-instance entities, we use `TextureCache`'s handle.

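
The parallel-cache idea from the sketch above, reduced to its core: look up by GL texture name, and create the handle only on a miss. This is an illustration under stated assumptions, not `TextureCache`'s real shape. The fixed-size table is assumed never to fill, and `fake_create_handle` is a stand-in for the real `glGetTextureHandleARB` + `glMakeTextureHandleResidentARB` pair so the example runs without a GL context:

```c
#include <assert.h>
#include <stdint.h>

#define HANDLE_CACHE_SLOTS 64u /* power of two; sketch assumes it never fills */

typedef uint64_t (*CreateHandleFn)(uint32_t texture_name);

typedef struct {
    uint32_t names[HANDLE_CACHE_SLOTS];   /* 0 marks an empty slot; glGenTextures never returns 0 */
    uint64_t handles[HANDLE_CACHE_SLOTS];
    CreateHandleFn create;                /* creates the handle and makes it resident */
} BindlessHandleCache;

/* Returns the cached handle for `name`, creating it on first request. */
static uint64_t get_bindless_handle(BindlessHandleCache *c, uint32_t name)
{
    uint32_t i = (name * 2654435761u) & (HANDLE_CACHE_SLOTS - 1u);
    while (c->names[i] != 0 && c->names[i] != name)
        i = (i + 1u) & (HANDLE_CACHE_SLOTS - 1u); /* linear probing */
    if (c->names[i] == 0) {
        c->names[i] = name;
        c->handles[i] = c->create(name);
    }
    return c->handles[i];
}

/* Stand-in for the GL calls, counting how often a handle is created. */
static unsigned g_fake_create_calls = 0;
static uint64_t fake_create_handle(uint32_t texture_name)
{
    ++g_fake_create_calls;
    return 0x100000000ULL | texture_name;
}
```

In the real cache the key would be whatever `GetOrUpload*` already keys on; keying by GL name here just keeps the example small.
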
### The new shader

Reuse WB's `StaticObjectModern.vert` / `StaticObjectModern.frag` as a
template. Read those files cold. They already do bindless + the
instance-data layout. Adapt them to acdream's `mesh_instanced.vert/frag`
conventions:

- Keep the `uViewProjection` uniform, the lighting UBO at binding=1, and
  the fog uniforms.
- Add `#version 430 core` + `#extension GL_ARB_bindless_texture : require`.
- Replace `uniform sampler2D uDiffuse` with a per-instance `uvec2`
  attribute (location 7), then either reconstruct the sampler in the
  vertex shader or pass the `uvec2` through to the fragment shader as a
  `flat` varying (integer varyings cannot be interpolated) and
  reconstruct it there.
- Either drop the `uTranslucencyKind` uniform or keep it. It is still
  set per pass; multi-draw indirect doesn't break uniforms, and the only
  constraint is on state that varies per draw.

### Translucency

Multi-draw indirect can't change blend state between the draws of a
single call. Solution: **still use two passes** (opaque + translucent),
but within the translucent pass keep the per-blendfunc sub-passes
(additive, alpha-blend, inv-alpha). That makes three sub-passes, each
one a single `glMultiDrawElementsIndirect` over its filtered groups.

Or: if perf allows, fold all four blend modes into the shader via a
per-instance blendmode int, sort all translucent groups by blendmode
in the indirect buffer, and switch blend state at sub-pass boundaries.
The brainstorm decides the cleanest pattern.

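
For the sorted-buffer variant, the CPU-side work is a sort followed by finding each blend mode's contiguous range. A minimal sketch with hypothetical types; note that `qsort` is not stable, so any back-to-front ordering inside a blend mode would need a secondary sort key that this example leaves out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum {
    BLEND_ADDITIVE  = 0,
    BLEND_ALPHA     = 1,
    BLEND_INV_ALPHA = 2,
    BLEND_MODE_COUNT = 3
} BlendMode;

/* Hypothetical translucent group record. */
typedef struct {
    BlendMode blend;
    uint32_t  group_id;
} TranslucentGroup;

/* Contiguous slice of the sorted array belonging to one blend mode. */
typedef struct {
    size_t first; /* index of the first command for this blend mode */
    size_t count; /* number of commands; 0 means skip the sub-pass */
} SubPassRange;

static int compare_by_blend(const void *a, const void *b)
{
    const TranslucentGroup *ga = a;
    const TranslucentGroup *gb = b;
    return (int)ga->blend - (int)gb->blend;
}

/* Sorts groups by blend mode in place and records one range per mode. */
static void partition_by_blend(TranslucentGroup *groups, size_t n,
                               SubPassRange ranges[BLEND_MODE_COUNT])
{
    qsort(groups, n, sizeof *groups, compare_by_blend);
    for (int m = 0; m < BLEND_MODE_COUNT; ++m) {
        ranges[m].first = 0;
        ranges[m].count = 0;
    }
    for (size_t i = 0; i < n; ++i) {
        SubPassRange *r = &ranges[groups[i].blend];
        if (r->count == 0)
            r->first = i;
        r->count++;
    }
}
```

Each non-empty range then maps to one sub-pass: set the blend function, and issue one `glMultiDrawElementsIndirect` with a byte offset of `first * 20` into the indirect buffer and a `drawcount` of `count`.
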

---

## Files to read before brainstorming

In rough order:

1. **N.4 plan + spec**: `docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`
   (status: Final). Adjustments 7-10 capture the gotchas. The spec is at
   `docs/superpowers/specs/2026-05-08-phase-n4-rendering-foundation-design.md`.

2. **N.4 dispatcher source**: `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`.
   This is what you're modifying. Read it end-to-end.

3. **WB's modern rendering shaders**: `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Shaders/StaticObjectModern.vert`
   + `StaticObjectModern.frag`. The template you're adapting from.

4. **WB's `ObjectMeshManager.UploadGfxObjMeshData`**: lines ~1654-1780
   of `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/ObjectMeshManager.cs`.
   Shows how WB sets up the modern path's VBO/IBO/VAO. Especially note
   how it patches in instance attribute slots (locations 3-6) on the
   global VAO and configures location 7+ for bindless handles.

5. **WB's `ObjectRenderBatch`**: same file, lines ~166-184. Note the
   `BindlessTextureHandle` field, which is already populated when
   `_useModernRendering` is on.

6. **Our `TextureCache`**: `src/AcDream.App/Rendering/TextureCache.cs`.
   Three composite caches: by surface id, by surface+origTex, by
   surface+origTex+palette. N.5 adds parallel bindless-handle caches.

7. **CLAUDE.md "WB integration cribs"** section. Lines ~28-80. The
   three gotchas + the integration architecture in plain language.

8. **Memory: `project_phase_n4_state.md`**: the same content from a
   different angle. Reading both helps lock in the gotchas.

---

## Brainstorm questions

These are the questions to resolve in the brainstorm step. Don't
prejudge them; bring them to the user with options + a recommendation:

1. **Instance attribute layout.** Match WB's `InstanceData` struct
   (80 bytes including CellId/Flags fields we don't use) for global
   VAO compatibility, or define a minimal acdream-specific version
   (mat4 + handle = 72 bytes, padded to 80)?

2. **Bindless handle generation strategy.**
   - At texture upload time? (Eager: every texture that lands in
     `TextureCache` gets a handle. The memory cost is roughly
     per-texture driver state.)
   - On first draw lookup? (Lazy: the cache fills as the scene exercises
     content. Possible first-use stall.)
   - At spawn time via the spawn adapter? (Tied to lifecycle. Cleanest,
     but requires touching the spawn path.)

3. **Translucent pass structure.** Three sub-indirect-draws (one per
   blend mode), or a single sorted indirect buffer with per-instance
   blend mode + state-flip at sub-pass boundaries? Or just iterate
   per-group like N.4 for translucent only (translucent groups are a
   small fraction of the total)?

4. **Persistent-mapped indirect + instance buffers.** Use
   `GL_ARB_buffer_storage` + `GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT`?
   Triple-buffered ring + sync object? Or stick with `glBufferData`
   (still one upload per frame, just larger)? Persistent mapping is a
   ~2-5% per-frame win in our context but adds buffer-management
   complexity.

5. **Shader unification.** Keep `mesh_instanced` for legacy + add
   `mesh_indirect` for modern, or replace `mesh_instanced` entirely?
   Replacement requires the legacy `InstancedMeshRenderer` (the escape
   hatch under `ACDREAM_USE_WB_FOUNDATION=0`) to also use the new
   shader, which... probably doesn't matter if we delete legacy in
   N.6 anyway. Brainstorm.

6. **Conformance test strategy.** N.4 used visual verification at
   Holtburg as the gate. N.5's gate is "no visual regression vs N.4
   AND measurable CPU win." How do we measure CPU? The `[WB-DIAG]`
   counters give draw count + group count; we need frame-time
   counters too. Add them to the dispatcher? Use a profiler?

7. **Per-instance entity bindless.** `TextureCache.GetOrUpload*`
   returns a GL name. The dispatcher (or `TextureCache` itself) needs
   to convert that to a bindless handle. Design questions:
   - Where does the conversion happen?
   - When is the texture made resident? (Residency is global state;
     too many resident textures hits driver limits.)
   - What about palette/surface overrides: same caching key as the
     name, just a parallel handle dictionary?

8. **Escape hatch.** N.4 keeps `ACDREAM_USE_WB_FOUNDATION=0` as a
   fallback. N.5 needs to decide: does the new shader REPLACE the
   N.4 dispatcher's draw path (so flag-on means N.5 modern path,
   flag-off means legacy `InstancedMeshRenderer`)? Or do we add a
   separate flag (`ACDREAM_USE_MODERN_DRAW`) so users can toggle
   N.4 vs N.5 vs legacy independently? A three-way flag is more
   complex but useful for A/B during rollout.

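
For question 4, the ring arithmetic itself is small, which may help when weighing the complexity. A sketch assuming one persistently mapped buffer split into three equal regions, one per in-flight frame; the per-region fence wait (for example `glFenceSync` / `glClientWaitSync`) is deliberately left out, and all names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_FRAMES 3u /* triple buffering */

/* The slice of the mapped buffer that a given frame may write to. */
typedef struct {
    size_t offset; /* byte offset from the start of the mapped buffer */
    size_t size;   /* bytes available to this frame */
} RingRegion;

/* Frames cycle through the regions round-robin, so the CPU writes frame
   N+1's data while the GPU may still be reading frame N's. */
static RingRegion ring_region_for_frame(uint64_t frame_index, size_t region_size)
{
    RingRegion r;
    r.offset = (size_t)(frame_index % RING_FRAMES) * region_size;
    r.size = region_size;
    return r;
}
```

The part that adds real complexity is what this omits: before reusing a region, the CPU has to wait on the fence placed after the last frame that read from it.
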

---

## Spec structure

After the brainstorm, the spec doc covers:

1. **Architecture diagram**: how `WbDrawDispatcher` changes shape,
   where the indirect buffer lives, and where bindless handles flow from.
2. **Instance data layout**: exact struct, byte offsets, GL attribute
   pointer setup.
3. **TextureCache changes**: new methods, new cache, residency
   policy.
4. **Shader files**: name(s), version, extensions, in/out variables.
5. **Conformance tests**: what to write, what coverage to claim.
6. **Acceptance criteria**: visual identity to N.4 + measured CPU
   delta.
7. **Risks**: driver bugs in bindless / indirect, residency limits,
   shader compile issues on weird GPUs, the legacy escape hatch
   breaking.

Spec lives at: `docs/superpowers/specs/2026-05-XX-phase-n5-modern-rendering-design.md`.

## Plan structure

After the spec, the plan doc lays out the week-by-week task list.
Match N.4's plan structure (living document, task checkboxes, commit
SHAs appended, adjustments documented inline). Plan lives at:
`docs/superpowers/plans/2026-05-XX-phase-n5-modern-rendering.md`.

Suggested initial breakdown (brainstorm + spec will refine):

- **Week 1**: Plumbing. Bindless handle generation in `TextureCache`,
  shader rewrite (compile + bind), instance-attrib layout updated to
  mat4+handle. The dispatcher still uses per-group draws but reads
  textures bindless. Validate: visually identical to N.4.
- **Week 2**: Indirect. Build the `DrawElementsIndirectCommand` buffer
  per frame, switch to `glMultiDrawElementsIndirect`. Three-pass
  translucent (or whatever the brainstorm decides). Validate: visually
  identical, draw-call count drops to 2-4 per frame.
- **Week 3**: Polish + ship. Persistent-mapped buffers if the brainstorm
  voted yes, profiler/counters, visual verification, flag flip, plan
  finalization.

---

## Acceptance criteria for the whole phase

- Visual output identical to N.4 (no character regressions, no
  scenery missing, no z-fighting introduced)
- `[WB-DIAG]` shows `drawsIssued` ≤ ~5 per frame (down from N.4's
  few hundred)
- Frame time measurably lower in dense scenes (specify which scenes
  to test in the spec; probably Holtburg courtyard + Foundry
  interior)
- All tests still green (940/948 + any new conformance tests)
- `ACDREAM_USE_WB_FOUNDATION=0` escape hatch still works
- Plan doc finalized, roadmap updated, memory captured if N.5
  surfaces durable lessons (it almost certainly will: bindless
  + indirect both have well-known driver gotchas)

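
"Measurably lower" needs a number to compare. One low-tech option is a rolling mean of CPU frame time sampled around the dispatcher, sketched below. The window is deliberately tiny for illustration, and every name here is hypothetical rather than an existing `[WB-DIAG]` counter:

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_WINDOW 4u /* tiny for illustration; a real window would be larger */

/* Fixed-size rolling window over recent frame times, in milliseconds. */
typedef struct {
    double samples_ms[FRAME_WINDOW];
    size_t count;  /* number of valid samples, up to FRAME_WINDOW */
    size_t next;   /* slot the next sample will overwrite */
    double sum_ms; /* running sum of the valid samples */
} FrameTimeWindow;

/* Adds one frame time, evicting the oldest sample once the window is full. */
static void frame_time_push(FrameTimeWindow *w, double ms)
{
    if (w->count == FRAME_WINDOW)
        w->sum_ms -= w->samples_ms[w->next]; /* drop the oldest sample */
    else
        w->count++;
    w->samples_ms[w->next] = ms;
    w->sum_ms += ms;
    w->next = (w->next + 1u) % FRAME_WINDOW;
}

/* Mean frame time over the window, or 0 if no samples were recorded. */
static double frame_time_mean(const FrameTimeWindow *w)
{
    return w->count ? w->sum_ms / (double)w->count : 0.0;
}
```

Capturing this mean at the same camera position in N.4 and N.5 builds gives a like-for-like delta to record in the plan doc.
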

---

## What you'll be doing in the first 30 minutes

1. Read this handoff in full.
2. Read CLAUDE.md "WB integration cribs" section.
3. Read `WbDrawDispatcher.cs` end-to-end.
4. Skim WB's `StaticObjectModern.vert/frag` + `ObjectMeshManager.UploadGfxObjMeshData`
   to ground the reference.
5. Verify build is green: `dotnet build`.
6. Verify N.4 ship is intact: `dotnet test --filter "FullyQualifiedName~Wb|MatrixComposition"`
   should produce 60 passing tests, 0 failures.
7. Invoke the `superpowers:brainstorming` skill with the user. Walk
   through the 8 brainstorm questions above. Capture decisions in a
   spec.
8. Write the spec at the path above.
9. Write the plan at the path above.
10. Begin Week 1 implementation per the plan.

Don't skip the brainstorm. Multi-draw indirect + bindless have several
real driver-compatibility / API-shape decisions that need user input,
not "the agent makes a call and goes." This phase is structurally the
same shape as N.4: brainstorm → spec → plan → tasks-with-checkboxes →
commits-update-checkboxes → final SHIP commit.

---

## Things to NOT do

- **Don't delete the legacy `InstancedMeshRenderer`.** It's the N.4
  escape hatch. N.6 retires it after N.5 is proven default-on.
- **Don't fork WB.** N.4 deliberately avoided fork patches by using
  the side-table pattern (`AcSurfaceMetadataTable`). Stay on that
  path. If you need data WB doesn't expose, add a side-table or
  decode it yourself from dats.
- **Don't try to make per-instance entities use WB's `TextureAtlasManager`.**
  That's N.6+ territory. acdream's `TextureCache` owns palette/surface
  overrides because WB's atlas is keyed by `(surfaceId, paletteId,
  stippling, isSolid)` and our overrides don't fit cleanly. Bindless
  handles let us escape that mismatch: we get handles for both
  atlas-tier AND per-instance-tier textures, with no atlas adoption
  needed.
- **Don't skip visual verification.** N.4 surfaced three bugs at
  visual verification that no test caught. Don't trust "build green +
  tests pass"; exercise the rendering path with the local ACE server.
- **Don't extend the phase scope.** N.5 is bindless + indirect on
  the existing rendering pipeline. Texture array atlas, GPU-side
  culling, terrain wiring: all of those are subsequent phases. If
  the brainstorm tries to expand, push back.

---

## Reference: the N.4 dispatcher flow you're modifying

```
Draw(camera, landblockEntries, frustum, ...) {
    // Phase 1: walk entities, build groups
    foreach (entity, meshRef, batch) {
        cull, classify into _groups[GroupKey]
    }

    // Phase 2: lay matrices contiguously
    // Phase 3: glBufferData(_instanceVbo, allMatrices)
    // Phase 4: bind global VAO once
    // Phase 5: opaque pass (sorted)
    foreach (group in _opaqueDraws) {
        glBindTexture(group.handle)
        glBindBuffer(EBO, group.ibo)
        glDrawElementsInstancedBaseVertexBaseInstance(...)
    }
    // Phase 6: translucent pass
}
```

After N.5, Phases 5 and 6 collapse to:

```
glBindBuffer(DRAW_INDIRECT_BUFFER, _opaqueIndirect)
glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_SHORT, 0, opaqueGroups.Count, sizeof(DEIC))
glBindBuffer(DRAW_INDIRECT_BUFFER, _translucentIndirect)
// 3 sub-calls for translucent or 1 if shader-folded
glMultiDrawElementsIndirect(...)
```

That's the destination. Get there cleanly.

Good luck. Holler at the user if any of the brainstorm questions feel
genuinely ambiguous after reading the references; they care about
this phase landing right and will engage on design questions.