Commit graph

13 commits

Author SHA1 Message Date
Erik
dcae2b6b94 phase(N.5): retirement amendment — InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag deleted
Final cross-cutting review of N.5 found that Task 15's deletion of
mesh_instanced.vert/.frag left InstancedMeshRenderer orphaned —
ACDREAM_USE_WB_FOUNDATION=0 silently rendered terrain+sky only with
no entities. The SHIP commit's "[x] ACDREAM_USE_WB_FOUNDATION=0 still
works" claim was inaccurate.

Resolution: formal retirement of the legacy renderer path within N.5
instead of deferring to N.6.

Deleted:
- src/AcDream.App/Rendering/InstancedMeshRenderer.cs
- src/AcDream.App/Rendering/StaticMeshRenderer.cs
- src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs

GameWindow simplified — capability detection is unconditional, missing
bindless throws NotSupportedException with a clear message at startup.
WbDrawDispatcher + mesh_modern shader load are mandatory after init.
No escape hatch.

GpuWorldState simplified — WbFoundationFlag.IsEnabled guards on
AddLandblock/RemoveLandblock removed; adapter calls are unconditional
when the adapter is non-null.

PendingSpawnIntegrationTests updated — WbFoundationFlag.ForTestsOnly_ForceEnable
static ctor removed (flag is gone; adapter calls are unconditional).

The ApplyLoadedTerrain physics-data loop was also simplified: the
EnsureUploaded sub-loop that fed InstancedMeshRenderer is gone;
_pendingCellMeshes is now explicitly cleared to prevent unbounded
accumulation (the worker thread still populates it, but WB handles
EnvCell geometry through its own pipeline).

Spec §2 Decision 5 + §10 Out-of-Scope updated. Plan ship-amendment
section added. Roadmap updated (N.5 ships with retirement; N.6 scope
narrowed to perf-only). CLAUDE.md "WB integration cribs" updated.
Perf baseline doc updated. WbDrawDispatcher class summary docstring
corrected to describe the as-shipped SSBO + multi-draw-indirect path.
ISSUES.md #51 updated (terrain not in N.5 scope; deferred to N.7).

Bindless support is now a hard requirement. Modern desktop GPUs
universally expose GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters;
if a user hits the NotSupportedException, that's a real bug report
worth investigating, not a silent fallback.

Build: 0 errors, 0 warnings. Tests: 71/71 (Wb+MatrixComposition+TextureCacheBindless filter).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:01:36 +02:00
Erik
d114dca1e8 phase(N.5) Task 12: CPU stopwatch + GL_TIME_ELAPSED queries in [WB-DIAG]
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent) wrapping each glMultiDrawElementsIndirect,
polled non-blocking via QueryResultAvailable on the next frame.

Sample window is 256 frames per signal; median + p95 reported.
Numbers populate the SHIP commit's perf table at Task 19.

Silk.NET naming note: GL_TIME_ELAPSED queries use QueryTarget.TimeElapsed
(confirmed present in Silk.NET.OpenGL 2.23.0 DLL). The 64-bit result is
read via GetQueryObject(..., out ulong) which dispatches to
glGetQueryObjectui64v; the int overload (glGetQueryObjectiv) is used for
the ResultAvailable poll, matching WorldBuilder's VisibilityManager pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:57:26 +02:00
Erik
cfe1ca3151 phase(N.5) Task 11: translucency partition contract test
Locks in Decision 2 (Opaque + ClipMap → opaque indirect; AlphaBlend +
Additive + InvAlpha → transparent indirect). Catches future refactors
that drift the partition — silent visual regression otherwise (groups
rendered in the wrong pass with the wrong blend state).

Adds public static IsOpaquePublic shim on WbDrawDispatcher; the
underlying IsOpaque stays private.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:53:36 +02:00
Erik
f533414edf phase(N.5) Task 10: glMultiDrawElementsIndirect dispatch — visual verified
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three SSBOs:
- _instanceSsbo @ binding=0 (mat4 per instance, indexed by gl_BaseInstanceARB + gl_InstanceID)
- _batchSsbo @ binding=1 (BatchData per group, indexed by gl_DrawIDARB)
- _indirectBuffer (DrawElementsIndirectCommand[] — opaque first, transparent second)

GameWindow swaps the shader load to mesh_modern when _bindlessSupport
is non-null. Capability detection + shader load now run in the right
order (capability before TextureCache + before Shader).

Deletes the obsolete DrawGroup stub, EnsureInstanceAttribs, _instanceBuffer,
_patchedVaos. ClassifyBatches + ResolveTexture already migrated in
Task 8 to use ulong bindless handles.

BuildIndirectArrays (Task 9) wired in: _opaqueDraws + _translucentDraws
are flattened into IndirectGroupInput[], laid out via the helper into
contiguous indirect commands + parallel BatchData[]. opaqueByteOffset=0,
transparentByteOffset = opaqueCount × DrawCommandStride.

Visual verification (USER GATE) PASS: Holtburg courtyard renders
identical to N.4 — terrain, scenery, characters, NPCs all visible
without artifacts. [N.5] modern path capabilities present + mesh_modern
shader loaded log lines confirm the boot path. [WB-DIAG] hot-path
counters show healthy entity/draw activity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:51:49 +02:00
Erik
b163c53622 phase(N.5) Task 9 fixup: layout assertion + DrawCommandStride const
Code quality review caught:
- sizeofDEIC was a local; promoted to public const DrawCommandStride
  so tests can reference it symbolically.
- BatchDataPublic layout invariant (size + field offsets) wasn't
  asserted in tests. Added BatchDataPublic_LayoutMatchesPrivateBatchData
  + DrawCommandStride_MatchesStructSize tests to gate Task 10's
  MemoryMarshal.Cast<BatchData, BatchDataPublic> safety.
- Plan doc updated: BatchDataPublic spec was Pack=4 (wrong — must
  match private BatchData's Pack=8 for the cast to work). Implementation
  was already correct; plan now matches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:42:49 +02:00
Erik
9a7a250b62 phase(N.5) Task 9: BuildIndirectArrays — CPU layout for indirect dispatch
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.

Tests cover: spec §5 walk-through layout; empty group list edge case;
ClipMap classification (treated as opaque, not transparent).

Static + public so tests can exercise without a GL context. Task 10
wires it into the rewritten Draw() method.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:38:22 +02:00
Erik
424d7b9015 phase(N.5) Task 8: InstanceGroup + GroupKey carry bindless handle + layer
Replaces uint TextureHandle (32-bit GL name) with ulong
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
return type. Adds TextureLayer (always 0 for per-instance composites,
becomes meaningful when WB atlas is adopted in N.6).

ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants —
these return Texture2DArray-backed bindless handles (Task 3 work).

DrawGroup body throws NotImplementedException — Task 10 rewrites the
whole Draw() method to use glMultiDrawElementsIndirect, which makes
DrawGroup obsolete. CPU-only tests don't invoke DrawGroup so the build
+ test gates stay green; visual launch fails until Task 10 (intentional).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:32:38 +02:00
Erik
1b6995d2df phase(N.5) Task 7 fixup: BatchData Pack=8 for ulong alignment
Code quality review caught that BatchData uses Pack=4 but contains a
ulong field. With the current field order (TextureHandle first), offset
0 is always 8-byte aligned so std430 works. But adding a 4-byte field
before TextureHandle without bumping Pack would silently misalign the
GPU struct. Pack=8 makes the alignment requirement explicit and adds
a comment documenting expected std430 offsets.

No runtime change — current offsets (0/8/12) are identical under both
Pack values for this field order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:29:58 +02:00
Erik
86c471d2d1 phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure
Adds DrawElementsIndirectCommand struct (20-byte layout for
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
constructor parameter — non-null required since the dispatcher is only
constructed when WB foundation is on (which implies bindless is present
per Task 6 capability detection).

Existing Draw() method substitutes _instanceVbo -> _instanceSsbo for
compile. Behavior is temporarily wrong (SSBO bound as ArrayBuffer for
per-vertex attribs); Tasks 9-10 fully rewrite the draw loop and the
per-frame uploads to use BindBufferBase + glMultiDrawElementsIndirect.

GameWindow construction site updated to add _bindlessSupport guard and
pass it as the new last argument to the constructor. Dispatcher is only
constructed when bindless is guaranteed present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:25:29 +02:00
Erik
573526dae5 phase(N.4): WbDrawDispatcher perf pass — sort, cull, hash memoization
Four small wins on top of the grouped-instanced refactor.

1. Drop unused animState lookup. Was a side-effect-free
   _entitySpawnAdapter.GetState call per per-instance entity, made
   redundant by the Issue #47 fix that trusts MeshRefs.

2. Front-to-back sort opaque groups. Squared distance from camera to
   each group's first-instance translation; ascending sort. Lets the
   GPU's depth test reject fragments behind closer geometry — real
   win on dense scenes (Holtburg courtyard, Foundry interior).

3. Per-entity AABB frustum cull. 5m-radius AABB check per entity
   before walking parts. Skips work for distant entities even when
   their landblock is partially visible. Animated entities (other
   characters, NPCs, monsters) bypass — they always need per-frame
   work for animation regardless. Conservative radius covers typical
   entity bounds; large outliers stay landblock-culled.

4. Memoize palette hash per entity. TextureCache.HashPaletteOverride
   is now internal; new GetOrUploadWithPaletteOverride overload takes
   a precomputed hash. The dispatcher computes it ONCE per entity and
   reuses across every (part, batch) lookup, avoiding the per-batch
   FNV-1a fold over SubPalettes. Trees / scenery without palette
   overrides skip entirely (palHash stays 0).

Visual output unchanged; FPS up further, especially in dense scenes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 17:51:03 +02:00
Erik
7b41efc281 phase(N.4): WbDrawDispatcher — FirstIndex/BaseVertex + Issue #47 + grouped instanced draws
Three bugs surfaced and resolved during Task 26 visual verification.

1. **No-scenery + exploded characters**: WB's modern rendering path
   (GL 4.3 + bindless) packs every mesh into a single global VAO/VBO/IBO
   (GlobalMeshBuffer). Each batch references its slice via FirstIndex
   (offset into IBO) + BaseVertex (offset into VBO). The dispatcher's
   DrawElementsInstanced(indices=0) read offset 0 of the global IBO
   for every entity — drawing the same first triangle from every
   entity position. Switched to glDrawElementsInstancedBaseVertex(
   BaseInstance) with the batch's offsets. Scenery + connected
   characters now render correctly.

2. **Issue #47 character regression**: Adjustment 6 stored
   AnimPartChanges on WorldEntity.PartOverrides using the raw
   server-sent NewModelId (no degrade resolver applied). The
   dispatcher's animState.ResolvePartGfxObj override path then
   clobbered MeshRefs (which GameWindow's spawn code correctly
   resolves to close-detail meshes via GfxObjDegradeResolver).
   Result: humanoids drew low-detail (~14 verts/17 polys) base
   meshes instead of close-detail (~32 verts/60 polys), losing
   bicep / shoulder / back geometry. Fix: trust MeshRefs as the
   source of truth and don't re-apply animState overrides at draw
   time. AnimatedEntityState's overrides only matter for hot-swap
   appearance updates (0xF625) which today rebuild MeshRefs anyway.

3. **Performance — sub-100 FPS on Holtburg**: per-entity
   single-instance draws meant ~16K glDraw calls/frame plus a
   64-byte glBufferSubData per call. Refactored to grouped
   instanced rendering: bucket all (entity, batch) pairs by
   GroupKey(Ibo, FirstIndex, BaseVertex, IndexCount, TextureHandle,
   Translucency); upload all matrices in ONE BufferData call;
   one glDrawElementsInstancedBaseVertexBaseInstance per group
   with BaseInstance pointing at the group's slice in the shared
   instance VBO. Down from ~16K to a few hundred draws/frame
   (~30× fewer). Bind VAO once per frame (modern WB shares one
   global VAO). Removed redundant per-draw VertexAttribPointer
   (VAO captures that state).

Result: Holtburg renders correctly with characters showing full
detail; FPS climbed substantially. Two more bugs (mesh loading
+ batch.Key.SurfaceId) were fixed in the prior commit (943652d).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 17:39:02 +02:00
Erik
943652dc97 phase(N.4) Tasks 22+23 fixup: trigger WB mesh loads + correct SurfaceId source
Task 26 visual verification surfaced three bugs in the dispatcher.
Two are fixed here; the third is documented as a remaining issue.

1. WB's IncrementRefCount only bumps a usage counter — it does NOT
   trigger mesh loading. Fixed in WbMeshAdapter.IncrementRefCount:
   call PrepareMeshDataAsync(id, isSetup: false) on first registration.
   Result auto-enqueues to _stagedMeshData (line 510 of WB's
   ObjectMeshManager) which Tick() drains onto the GPU.

2. EntitySpawnAdapter never registered per-instance entity meshes
   with WB. LandblockSpawnAdapter only registers atlas-tier
   (ServerGuid == 0); per-instance entities fell through. Fixed by
   adding optional IWbMeshAdapter constructor param + tracking unique
   GfxObj ids per server-guid for IncrementRefCount on OnCreate /
   DecrementRefCount on OnRemove.

3. WbDrawDispatcher.ResolveTexture used batch.SurfaceId which WB
   never populates (line 1746 of ObjectMeshManager only sets
   batch.Key — the TextureKey struct that has SurfaceId). Switched
   to batch.Key.SurfaceId.

Plus diagnostic counters (ACDREAM_WB_DIAG=1) for entity-seen / drawn
/ mesh-missing / draws-issued counts.

Status: with these fixes the dispatcher now issues real draw calls
(~16K/frame, validated via diagnostic). However visual verification
shows characters appear "exploded" (parts spaced too far apart) and
scenery (trees/rocks/fences/buildings) does not appear. Root cause
analysis pending — Adjustment 7 in the plan documents the deferred
work. Flag stays default-off; legacy renderer remains the
production path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 15:50:21 +02:00
Erik
01cff4144f phase(N.4) Tasks 22+23: WbDrawDispatcher + surface metadata side-table
WbDrawDispatcher draws all entities through WB's ObjectRenderData
(VAO/VBO per GfxObj, per-batch IBO) using acdream's TextureCache for
texture resolution. Two-pass rendering (opaque+ClipMap, then
translucent) matching the existing InstancedMeshRenderer pattern.
Per-entity single-instance drawing for N.4 simplicity — true
instancing grouping deferred to N.6.

Atlas-tier entities: mesh from WB, texture from TextureCache via
batch SurfaceId. Per-instance-tier entities: AnimatedEntityState
drives part overrides + hidden-parts, palette/surface overrides
resolve through TextureCache's composite-key caches.

Side-table population (Task 23 folded in): WbMeshAdapter now takes
DatCollection and populates AcSurfaceMetadataTable on first
IncrementRefCount per GfxObj. The side-table provides TranslucencyKind
(critical for ClipMap alpha-test on vegetation) plus Luminosity,
Diffuse, SurfOpacity, NeedsUvRepeat, DisableFog for sky-pass and
lighting.

GameWindow wiring: when WbFoundationFlag is enabled, WbDrawDispatcher
draws everything and InstancedMeshRenderer is skipped. Flag-off path
is unchanged.

Matrix composition: restPose * animOverride * entityWorld, matching
the spec. Three MatrixCompositionTests verify the contract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 15:30:33 +02:00