Commit graph

6 commits

Author SHA1 Message Date
Erik
1b6995d2df phase(N.5) Task 7 fixup: BatchData Pack=8 for ulong alignment
Code quality review caught that BatchData uses Pack=4 but contains a
ulong field. With the current field order (TextureHandle first), offset
0 is always 8-byte aligned so std430 works. But adding a 4-byte field
before TextureHandle without bumping Pack would silently misalign the
GPU struct. Pack=8 makes the alignment requirement explicit and adds
a comment documenting expected std430 offsets.

No runtime change — current offsets (0/8/12) are identical under both
Pack values for this field order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:29:58 +02:00
Erik
86c471d2d1 phase(N.5) Task 7: dispatcher SSBO + indirect buffer infrastructure
Adds DrawElementsIndirectCommand struct (20-byte layout for
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
constructor parameter — non-null required since the dispatcher is only
constructed when WB foundation is on (which implies bindless is present
per Task 6 capability detection).

Existing Draw() method substitutes _instanceVbo -> _instanceSsbo for
compile. Behavior is temporarily wrong (SSBO bound as ArrayBuffer for
per-vertex attribs); Tasks 9-10 fully rewrite the draw loop and the
per-frame uploads to use BindBufferBase + glMultiDrawElementsIndirect.

GameWindow construction site updated to add _bindlessSupport guard and
pass it as the new last argument to the constructor. Dispatcher is only
constructed when bindless is guaranteed present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:25:29 +02:00
Erik
573526dae5 phase(N.4): WbDrawDispatcher perf pass — sort, cull, hash memoization
Four small wins on top of the grouped-instanced refactor.

1. Drop unused animState lookup. Was a side-effect-free
   _entitySpawnAdapter.GetState call per per-instance entity, made
   redundant by the Issue #47 fix that trusts MeshRefs.

2. Front-to-back sort opaque groups. Squared distance from camera to
   each group's first-instance translation; ascending sort. Lets the
   GPU's depth test reject fragments behind closer geometry — real
   win on dense scenes (Holtburg courtyard, Foundry interior).

3. Per-entity AABB frustum cull. 5m-radius AABB check per entity
   before walking parts. Skips work for distant entities even when
   their landblock is partially visible. Animated entities (other
   characters, NPCs, monsters) bypass — they always need per-frame
   work for animation regardless. Conservative radius covers typical
   entity bounds; large outliers stay landblock-culled.

4. Memoize palette hash per entity. TextureCache.HashPaletteOverride
   is now internal; new GetOrUploadWithPaletteOverride overload takes
   a precomputed hash. The dispatcher computes it ONCE per entity and
   reuses across every (part, batch) lookup, avoiding the per-batch
   FNV-1a fold over SubPalettes. Trees / scenery without palette
   overrides skip entirely (palHash stays 0).

Visual output unchanged; FPS up further, especially in dense scenes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 17:51:03 +02:00
Erik
7b41efc281 phase(N.4): WbDrawDispatcher — FirstIndex/BaseVertex + Issue #47 + grouped instanced draws
Three bugs surfaced and resolved during Task 26 visual verification.

1. **No-scenery + exploded characters**: WB's modern rendering path
   (GL 4.3 + bindless) packs every mesh into a single global VAO/VBO/IBO
   (GlobalMeshBuffer). Each batch references its slice via FirstIndex
   (offset into IBO) + BaseVertex (offset into VBO). The dispatcher's
   DrawElementsInstanced(indices=0) read offset 0 of the global IBO
   for every entity — drawing the same first triangle from every
   entity position. Switched to glDrawElementsInstancedBaseVertex(
   BaseInstance) with the batch's offsets. Scenery + connected
   characters now render correctly.

2. **Issue #47 character regression**: Adjustment 6 stored
   AnimPartChanges on WorldEntity.PartOverrides using the raw
   server-sent NewModelId (no degrade resolver applied). The
   dispatcher's animState.ResolvePartGfxObj override path then
   clobbered MeshRefs (which GameWindow's spawn code correctly
   resolves to close-detail meshes via GfxObjDegradeResolver).
   Result: humanoids drew low-detail (~14 verts/17 polys) base
   meshes instead of close-detail (~32 verts/60 polys), losing
   bicep / shoulder / back geometry. Fix: trust MeshRefs as the
   source of truth and don't re-apply animState overrides at draw
   time. AnimatedEntityState's overrides only matter for hot-swap
   appearance updates (0xF625) which today rebuild MeshRefs anyway.

3. **Performance — sub-100 FPS on Holtburg**: per-entity
   single-instance draws meant ~16K glDraw calls/frame plus a
   64-byte glBufferSubData per call. Refactored to grouped
   instanced rendering: bucket all (entity, batch) pairs by
   GroupKey(Ibo, FirstIndex, BaseVertex, IndexCount, TextureHandle,
   Translucency); upload all matrices in ONE BufferData call;
   one glDrawElementsInstancedBaseVertexBaseInstance per group
   with BaseInstance pointing at the group's slice in the shared
   instance VBO. Down from ~16K to a few hundred draws/frame
   (~30× fewer). Bind VAO once per frame (modern WB shares one
   global VAO). Removed redundant per-draw VertexAttribPointer
   (VAO captures that state).

Result: Holtburg renders correctly with characters showing full
detail; FPS climbed substantially. Two more bugs (mesh loading
+ batch.Key.SurfaceId) were fixed in the prior commit (943652d).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 17:39:02 +02:00
Erik
943652dc97 phase(N.4) Tasks 22+23 fixup: trigger WB mesh loads + correct SurfaceId source
Task 26 visual verification surfaced three bugs in the dispatcher.
Two are fixed here; the third is documented as a remaining issue.

1. WB's IncrementRefCount only bumps a usage counter — it does NOT
   trigger mesh loading. Fixed in WbMeshAdapter.IncrementRefCount:
   call PrepareMeshDataAsync(id, isSetup: false) on first registration.
   Result auto-enqueues to _stagedMeshData (line 510 of WB's
   ObjectMeshManager) which Tick() drains onto the GPU.

2. EntitySpawnAdapter never registered per-instance entity meshes
   with WB. LandblockSpawnAdapter only registers atlas-tier
   (ServerGuid == 0); per-instance entities fell through. Fixed by
   adding optional IWbMeshAdapter constructor param + tracking unique
   GfxObj ids per server-guid for IncrementRefCount on OnCreate /
   DecrementRefCount on OnRemove.

3. WbDrawDispatcher.ResolveTexture used batch.SurfaceId which WB
   never populates (line 1746 of ObjectMeshManager only sets
   batch.Key — the TextureKey struct that has SurfaceId). Switched
   to batch.Key.SurfaceId.

Plus diagnostic counters (ACDREAM_WB_DIAG=1) for entity-seen / drawn
/ mesh-missing / draws-issued counts.

Status: with these fixes the dispatcher now issues real draw calls
(~16K/frame, validated via diagnostic). However visual verification
shows characters appear "exploded" (parts spaced too far apart) and
scenery (trees/rocks/fences/buildings) does not appear. Root cause
analysis pending — Adjustment 7 in the plan documents the deferred
work. Flag stays default-off; legacy renderer remains the
production path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 15:50:21 +02:00
Erik
01cff4144f phase(N.4) Tasks 22+23: WbDrawDispatcher + surface metadata side-table
WbDrawDispatcher draws all entities through WB's ObjectRenderData
(VAO/VBO per GfxObj, per-batch IBO) using acdream's TextureCache for
texture resolution. Two-pass rendering (opaque+ClipMap, then
translucent) matching the existing InstancedMeshRenderer pattern.
Per-entity single-instance drawing for N.4 simplicity — true
instancing grouping deferred to N.6.

Atlas-tier entities: mesh from WB, texture from TextureCache via
batch SurfaceId. Per-instance-tier entities: AnimatedEntityState
drives part overrides + hidden-parts, palette/surface overrides
resolve through TextureCache's composite-key caches.

Side-table population (Task 23 folded in): WbMeshAdapter now takes
DatCollection and populates AcSurfaceMetadataTable on first
IncrementRefCount per GfxObj. The side-table provides TranslucencyKind
(critical for ClipMap alpha-test on vegetation) plus Luminosity,
Diffuse, SurfOpacity, NeedsUvRepeat, DisableFog for sky-pass and
lighting.

GameWindow wiring: when WbFoundationFlag is enabled, WbDrawDispatcher
draws everything and InstancedMeshRenderer is skipped. Flag-off path
is unchanged.

Matrix composition: restPose * animOverride * entityWorld, matching
the spec. Three MatrixCompositionTests verify the contract.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 15:30:33 +02:00