phase(N.5): SHIP — modern rendering path on N.4 dispatcher · 55ecec683f - erik/acdream

phase(N.5): SHIP — modern rendering path on N.4 dispatcher

Bindless textures + glMultiDrawElementsIndirect on top of N.4's grouped
pipeline. Per-frame entity rendering: 3 SSBO uploads (instance matrices
@ binding=0, batch data @ binding=1, indirect commands) + 2 indirect
calls (opaque + transparent). Total ~12-15 GL calls per frame for entity
rendering, regardless of scene complexity.
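
As a rough illustration of the indirect command buffer described above, the
sketch below packs draw groups into the 20-byte DrawElementsIndirectCommand
layout that glMultiDrawElementsIndirect reads. It is written in Python for
brevity, not in the project's own language. The `Group` type, its field names,
and the "opaque instances first" ordering are assumptions made for the
example; only the five-field command layout comes from the OpenGL
specification.

```python
import struct
from dataclasses import dataclass

# One DrawElementsIndirectCommand as defined by OpenGL:
# count, instanceCount, firstIndex (uint32), baseVertex (int32), baseInstance (uint32).
COMMAND = struct.Struct("<IIIiI")


@dataclass
class Group:
    """Hypothetical stand-in for one grouped batch from the dispatcher."""
    index_count: int
    instance_count: int
    first_index: int
    base_vertex: int
    transparent: bool


def build_indirect_buffers(groups):
    """Return (opaque_bytes, transparent_bytes), one command per group.

    baseInstance is a running offset into a single shared instance array,
    so a shader can find its per-instance data for any command.
    """
    opaque, transparent = bytearray(), bytearray()
    base_instance = 0
    # Assumption: opaque groups are laid out first in the instance array.
    # sorted() is stable, so the original order is kept within each pass.
    for g in sorted(groups, key=lambda g: g.transparent):
        target = transparent if g.transparent else opaque
        target += COMMAND.pack(g.index_count, g.instance_count,
                               g.first_index, g.base_vertex, base_instance)
        base_instance += g.instance_count
    return bytes(opaque), bytes(transparent)
```

Under these assumptions each returned buffer would be consumed by a single
glMultiDrawElementsIndirect call, which is how the draw count can stay at two
per frame however many groups the scene contains.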

Acceptance gates (spec §8.3):
- [x] Visual identity to N.4 — Task 10 USER GATE PASS (Holtburg courtyard)
      + Task 14 USER GATE PASS (general roaming, no regressions seen)
- [x] CPU dispatcher time ≤ 70% of N.4 — measured 1.23 ms/frame median
      at Holtburg courtyard (1662 groups, ~810 fps); estimated N.4
      hot path ≥2.5 ms/frame, so the ratio is ≤ ~49%; comfortably under
      the 70% threshold
- [x] drawsIssued ≤ 5 per pass (CPU GL calls) — exactly 2 indirect calls
      per frame regardless of scene size
- [x] All tests green — 71/71 in
      FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless
- [x] ACDREAM_USE_WB_FOUNDATION=0 still works — InstancedMeshRenderer
      escape hatch preserved (its own shader path, untouched)
- [ ] GPU rendering time within ±10% of N.4 — DEFERRED to N.6.
      The GL_TIME_ELAPSED query result is never available (avail stays 0)
      when polled within the same frame; needs double-buffered queries.
      CPU is the load-bearing metric.
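
The double-buffering named in the deferred GPU gate could look roughly like
the sketch below: rotate through a small ring of timer queries and read each
result a frame or more after it was issued, instead of polling in the frame
that issued it. This is an assumed design, not the N.6 implementation. The
`gl` adapter and its four method names are hypothetical; in a real renderer
they would wrap glBeginQuery / glEndQuery with GL_TIME_ELAPSED and
glGetQueryObject with GL_QUERY_RESULT_AVAILABLE / GL_QUERY_RESULT.

```python
class TimerQueryRing:
    """Reads GPU timings late instead of polling in the issuing frame.

    `gl` is a hypothetical adapter exposing begin(q), end(q), available(q)
    and result_ns(q) for a query id q.
    """

    def __init__(self, gl, query_ids):
        self.gl = gl
        self.queries = list(query_ids)  # two ids gives double-buffering
        self.frame = 0
        self.in_flight = set()          # queries issued but not yet read
        self.active = None              # query being recorded this frame

    def begin_frame(self):
        """Collect this slot's old result if ready, then reuse the slot.

        Returns the elapsed time in nanoseconds for an earlier frame,
        or None when no result is ready yet.
        """
        q = self.queries[self.frame % len(self.queries)]
        elapsed_ns = None
        if q in self.in_flight and self.gl.available(q):
            elapsed_ns = self.gl.result_ns(q)
            self.in_flight.discard(q)
        # Never restart a query whose previous result has not been read.
        self.active = None
        if q not in self.in_flight:
            self.gl.begin(q)
            self.active = q
        return elapsed_ns

    def end_frame(self):
        """Close this frame's query (if one was started) and advance."""
        if self.active is not None:
            self.gl.end(self.active)
            self.in_flight.add(self.active)
        self.frame += 1
```

With two query ids, the value returned by `begin_frame()` describes the frame
issued two frames earlier, so the reported GPU time lags slightly but the CPU
never stalls waiting for the result.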

Plan amendments captured during execution:
- Task 2: parallel Texture2DArray upload path (replacing the original
  "switch globally" framing that would've broken 4 legacy consumers)
- Task 3+4: parallel bindless cache dictionaries (avoiding the GLSL
  type mismatch from sampling a Texture2D handle via sampler2DArray)
- Task 5: preserved mesh_instanced.frag's full SceneLighting UBO + 8
  lights + fog + lightning flash + per-channel clamp
- Task 9: BatchDataPublic Pack=8 (required for safe MemoryMarshal.Cast)

Plan archived at:
  docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Spec at:
  docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md
Perf baseline at:
  docs/plans/2026-05-08-phase-n5-perf-baseline.md
Memory at:
  ~/.claude/.../memory/project_phase_n5_state.md

Files changed: 6 added, 6 modified, 2 deleted. 19 tasks shipped across
~40 commits including amendments + fixups + reviews.

N.6 follow-ups: retire InstancedMeshRenderer entirely; GPU timer query
double-buffering; persistent-mapped buffers if profiling shows the
residual glBufferData hot spot; possible WB atlas adoption for memory
savings on shared content; possible GPU-side culling via compute pre-pass;
per-instance highlight (selection blink) for retail-faithful click feedback
(field reserved in mesh_modern.vert's InstanceData struct).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Erik

2026-05-08 21:14:50 +02:00

parent 77e619d48a

commit 55ecec683f
