Final cross-cutting review of N.5 found that Task 15's deletion of
mesh_instanced.vert/.frag left InstancedMeshRenderer orphaned:
ACDREAM_USE_WB_FOUNDATION=0 silently rendered only terrain and sky, with
no entities. The SHIP commit's "[x] ACDREAM_USE_WB_FOUNDATION=0 still
works" claim was therefore inaccurate.
Resolution: formal retirement of the legacy renderer path within N.5
instead of deferring to N.6.
Deleted:
- src/AcDream.App/Rendering/InstancedMeshRenderer.cs
- src/AcDream.App/Rendering/StaticMeshRenderer.cs
- src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs
GameWindow simplified: capability detection is unconditional, and a
missing bindless extension throws NotSupportedException with a clear
message at startup. The WbDrawDispatcher + mesh_modern shader load is
mandatory after init. No escape hatch.
GpuWorldState simplified — WbFoundationFlag.IsEnabled guards on
AddLandblock/RemoveLandblock removed; adapter calls are unconditional
when the adapter is non-null.
PendingSpawnIntegrationTests updated — WbFoundationFlag.ForTestsOnly_ForceEnable
static ctor removed (flag is gone; adapter calls are unconditional).
The ApplyLoadedTerrain physics-data loop was also simplified: the
EnsureUploaded sub-loop that fed InstancedMeshRenderer is gone;
_pendingCellMeshes is now explicitly cleared to prevent unbounded
accumulation (the worker thread still populates it, but WB handles
EnvCell geometry through its own pipeline).
Spec §2 Decision 5 + §10 Out-of-Scope updated. Plan ship-amendment
section added. Roadmap updated (N.5 ships with retirement; N.6 scope
narrowed to perf-only). CLAUDE.md "WB integration cribs" updated.
Perf baseline doc updated. WbDrawDispatcher class summary docstring
corrected to describe the as-shipped SSBO + multi-draw-indirect path.
ISSUES.md #51 updated (terrain not in N.5 scope; deferred to N.7).
Bindless support is now a hard requirement. Current desktop drivers for
discrete GPUs generally expose GL_ARB_bindless_texture +
GL_ARB_shader_draw_parameters; if a user hits the NotSupportedException,
that is a real bug report worth investigating, and it is reported loudly
instead of degrading into a silent fallback.
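
A minimal sketch of the fail-fast check described above, assuming the
driver's extension list has already been collected into a set. The class
and method names (ModernPathRequirements, EnsureSupported) are
hypothetical and are not the identifiers used in GameWindow:

```csharp
using System;
using System.Collections.Generic;

public static class ModernPathRequirements
{
    // Extension names as listed in this commit message.
    public static readonly string[] Required =
    {
        "GL_ARB_bindless_texture",
        "GL_ARB_shader_draw_parameters",
    };

    // Hypothetical helper: throws when any required extension is absent
    // from the set reported by the driver; otherwise returns normally.
    public static void EnsureSupported(ISet<string> available)
    {
        var missing = new List<string>();
        foreach (var name in Required)
        {
            if (!available.Contains(name))
                missing.Add(name);
        }

        if (missing.Count > 0)
        {
            throw new NotSupportedException(
                "Required OpenGL extensions are missing: " +
                string.Join(", ", missing));
        }
    }
}
```

Listing every missing extension in the message keeps the resulting bug
report actionable without a second round trip to the user.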
Build: 0 errors, 0 warnings. Tests: 71/71 (Wb+MatrixComposition+TextureCacheBindless filter).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds median + 95th-percentile CPU + GPU dispatch time to the existing
5-second [WB-DIAG] rollup. CPU via Stopwatch (always running, cheap;
only logged under ACDREAM_WB_DIAG=1). GPU via two GL_TIME_ELAPSED
queries (opaque + transparent) wrapping each glMultiDrawElementsIndirect,
polled non-blocking via QueryResultAvailable on the next frame.
Sample window is 256 frames per signal; median + p95 reported.
Numbers populate the SHIP commit's perf table at Task 19.
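
For reference, one way the 256-frame window and its median/p95 readout
could look. This is a sketch only: the SampleWindow type is hypothetical,
and the nearest-rank percentile definition is an assumption, since the
commit does not say which definition the rollup uses:

```csharp
using System;

public sealed class SampleWindow
{
    private readonly double[] _samples;
    private int _next;   // slot the next sample overwrites
    private int _count;  // number of valid samples, capped at capacity

    public SampleWindow(int capacity = 256)
    {
        _samples = new double[capacity];
    }

    // Overwrites the oldest sample once the window is full.
    public void Add(double value)
    {
        _samples[_next] = value;
        _next = (_next + 1) % _samples.Length;
        if (_count < _samples.Length) _count++;
    }

    // Nearest-rank percentile over the current window, p in (0, 100].
    // Returns 0 when no samples have been recorded yet.
    public double Percentile(double p)
    {
        if (_count == 0) return 0.0;
        var sorted = new double[_count];
        Array.Copy(_samples, sorted, _count);
        Array.Sort(sorted);
        int rank = (int)Math.Ceiling(p * _count / 100.0);
        return sorted[Math.Clamp(rank, 1, _count) - 1];
    }
}
```

Sorting a 256-entry copy once per 5-second rollup is cheap enough that a
streaming quantile estimator is not needed for this purpose.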
Silk.NET naming note: GL_TIME_ELAPSED queries use QueryTarget.TimeElapsed
(confirmed present in Silk.NET.OpenGL 2.23.0 DLL). The 64-bit result is
read via GetQueryObject(..., out ulong) which dispatches to
glGetQueryObjectui64v; the int overload (glGetQueryObjectiv) is used for
the ResultAvailable poll, matching WorldBuilder's VisibilityManager pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Locks in Decision 2 (Opaque + ClipMap → opaque indirect; AlphaBlend +
Additive + InvAlpha → transparent indirect). Catches future refactors
that drift the partition — silent visual regression otherwise (groups
rendered in the wrong pass with the wrong blend state).
Adds public static IsOpaquePublic shim on WbDrawDispatcher; the
underlying IsOpaque stays private.
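
The partition being locked in can be sketched as follows. The enum member
names follow the kinds listed above, but their declaration order and the
PassPartition wrapper are assumptions for illustration, not the actual
WbDrawDispatcher code:

```csharp
public enum TranslucencyKind
{
    Opaque,
    ClipMap,
    AlphaBlend,
    Additive,
    InvAlpha,
}

public static class PassPartition
{
    // Opaque + ClipMap go to the opaque indirect call; everything else
    // (AlphaBlend, Additive, InvAlpha) goes to the transparent call.
    public static bool IsOpaque(TranslucencyKind kind) =>
        kind == TranslucencyKind.Opaque || kind == TranslucencyKind.ClipMap;
}
```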
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces WbDrawDispatcher's per-group glDrawElementsInstancedBaseVertexBaseInstance
loop with two glMultiDrawElementsIndirect calls (opaque + transparent).
Per-frame uploads three SSBOs:
- _instanceSsbo @ binding=0 (mat4 per instance, indexed by gl_BaseInstanceARB + gl_InstanceID)
- _batchSsbo @ binding=1 (BatchData per group, indexed by gl_DrawIDARB)
- _indirectBuffer (DrawElementsIndirectCommand[] — opaque first, transparent second)
GameWindow swaps the shader load to mesh_modern when _bindlessSupport
is non-null. Capability detection + shader load now run in the right
order (capability before TextureCache + before Shader).
Deletes the obsolete DrawGroup stub, EnsureInstanceAttribs, _instanceBuffer,
_patchedVaos. ClassifyBatches + ResolveTexture already migrated in
Task 8 to use ulong bindless handles.
BuildIndirectArrays (Task 9) wired in: _opaqueDraws + _translucentDraws
are flattened into IndirectGroupInput[], laid out via the helper into
contiguous indirect commands + parallel BatchData[]. opaqueByteOffset=0,
transparentByteOffset = opaqueCount × DrawCommandStride.
Visual verification (USER GATE) PASS: Holtburg courtyard renders
identical to N.4 — terrain, scenery, characters, NPCs all visible
without artifacts. [N.5] modern path capabilities present + mesh_modern
shader loaded log lines confirm the boot path. [WB-DIAG] hot-path
counters show healthy entity/draw activity.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code quality review caught:
- sizeofDEIC was a local; promoted to public const DrawCommandStride
so tests can reference it symbolically.
- BatchDataPublic layout invariant (size + field offsets) wasn't
asserted in tests. Added BatchDataPublic_LayoutMatchesPrivateBatchData
+ DrawCommandStride_MatchesStructSize tests to gate Task 10's
MemoryMarshal.Cast<BatchData, BatchDataPublic> safety.
- Plan doc updated: BatchDataPublic spec was Pack=4 (wrong — must
match private BatchData's Pack=8 for the cast to work). Implementation
was already correct; plan now matches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure CPU helper that lays out a group list into a contiguous indirect
buffer (DrawElementsIndirectCommand[]) and parallel BatchData[] —
opaque section first, transparent section second. Returns counts +
byte offset for the transparent section.
Tests cover: spec §5 walk-through layout; empty group list edge case;
ClipMap classification (treated as opaque, not transparent).
Static + public so tests can exercise without a GL context. Task 10
wires it into the rewritten Draw() method.
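
A simplified sketch of the layout step, for readers who want the shape
without opening the source. GroupInputSketch and IndirectLayoutSketch are
hypothetical stand-ins for IndirectGroupInput and the real helper, the
parallel BatchData[] output is omitted, and assigning BaseInstance as a
running sum in output order is an assumption about how the instance SSBO
is packed:

```csharp
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Field order matches the GL indirect command: 5 x 4 bytes = 20 bytes.
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical input record for this sketch.
public struct GroupInputSketch
{
    public uint IndexCount, InstanceCount, FirstIndex;
    public int BaseVertex;
    public bool IsOpaque;
}

public static class IndirectLayoutSketch
{
    public const int DrawCommandStride = 20;

    public static (DrawElementsIndirectCommand[] Commands, int OpaqueCount,
                   int TransparentCount, int TransparentByteOffset)
        Build(IReadOnlyList<GroupInputSketch> groups)
    {
        int opaqueCount = 0;
        foreach (var g in groups)
            if (g.IsOpaque) opaqueCount++;

        // Opaque commands fill [0, opaqueCount); transparent ones follow.
        var commands = new DrawElementsIndirectCommand[groups.Count];
        int nextOpaque = 0, nextTransparent = opaqueCount;
        foreach (var g in groups)
        {
            int slot = g.IsOpaque ? nextOpaque++ : nextTransparent++;
            commands[slot] = new DrawElementsIndirectCommand
            {
                Count = g.IndexCount,
                InstanceCount = g.InstanceCount,
                FirstIndex = g.FirstIndex,
                BaseVertex = g.BaseVertex,
            };
        }

        // Each command's instances start where the previous command's end.
        uint baseInstance = 0;
        for (int i = 0; i < commands.Length; i++)
        {
            commands[i].BaseInstance = baseInstance;
            baseInstance += commands[i].InstanceCount;
        }

        return (commands, opaqueCount, groups.Count - opaqueCount,
                opaqueCount * DrawCommandStride);
    }
}
```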
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces uint TextureHandle (32-bit GL name) with ulong
BindlessTextureHandle (64-bit) in InstanceGroup + GroupKey + ResolveTexture
return type. Adds TextureLayer (always 0 for per-instance composites,
becomes meaningful when WB atlas is adopted in N.6).
ClassifyBatches now calls TextureCache.GetOrUpload*Bindless variants —
these return Texture2DArray-backed bindless handles (Task 3 work).
DrawGroup body throws NotImplementedException — Task 10 rewrites the
whole Draw() method to use glMultiDrawElementsIndirect, which makes
DrawGroup obsolete. CPU-only tests don't invoke DrawGroup so the build
+ test gates stay green; visual launch fails until Task 10 (intentional).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code quality review caught that BatchData uses Pack=4 but contains a
ulong field. With the current field order (TextureHandle first), the
ulong sits at offset 0, which is always 8-byte aligned, so the layout
matches std430. But adding a 4-byte field before TextureHandle without
bumping Pack would silently misalign the GPU struct. Pack=8 makes the
alignment requirement explicit; this change also adds a comment
documenting the expected std430 offsets.
No runtime change — current offsets (0/8/12) are identical under both
Pack values for this field order.
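
The hazard can be reproduced with a few throwaway structs. This is an
illustrative sketch, not the real BatchData: only TextureHandle and the
0/8/12 offsets come from this commit, while the other field names
(TextureLayer, Flags, Extra) and all type names are hypothetical:

```csharp
using System.Runtime.InteropServices;

// Current field order: identical offsets (0/8/12) under Pack=4 and Pack=8.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct BatchDataSketch
{
    public ulong TextureHandle; // offset 0
    public uint TextureLayer;   // offset 8  (hypothetical field name)
    public uint Flags;          // offset 12 (hypothetical field name)
}

// A 4-byte field placed first: Pack=4 lets the ulong land at offset 4,
// which breaks the 8-byte alignment std430 requires for a 64-bit handle.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct LeadingUintPack4
{
    public uint Extra;
    public ulong TextureHandle;
}

// Same fields with Pack=8: the ulong is padded out to offset 8.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct LeadingUintPack8
{
    public uint Extra;
    public ulong TextureHandle;
}

public static class LayoutProbe
{
    public static int OffsetOf<T>(string field) =>
        (int)Marshal.OffsetOf<T>(field);

    public static int SizeOf<T>() => Marshal.SizeOf<T>();
}
```

Comparing LayoutProbe.OffsetOf for TextureHandle in the two LeadingUint
structs shows the divergence that Pack=8 guards against.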
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds DrawElementsIndirectCommand struct (20-byte layout for
glMultiDrawElementsIndirect). Replaces _instanceVbo field on
WbDrawDispatcher with three buffers: _instanceSsbo (mat4[]),
_batchSsbo (BatchData[]), _indirectBuffer (DEIC[]). Adds BindlessSupport
constructor parameter — non-null required since the dispatcher is only
constructed when WB foundation is on (which implies bindless is present
per Task 6 capability detection).
As a stopgap, the existing Draw() method substitutes _instanceSsbo for
_instanceVbo so that it still compiles. Behavior is temporarily wrong (SSBO
bound as ArrayBuffer for per-vertex attribs); Tasks 9-10 fully rewrite the
draw loop and the per-frame uploads to use BindBufferBase +
glMultiDrawElementsIndirect.
GameWindow construction site updated to add _bindlessSupport guard and
pass it as the new last argument to the constructor. Dispatcher is only
constructed when bindless is guaranteed present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small wins on top of the grouped-instanced refactor.
1. Drop unused animState lookup. It was a side-effect-free
_entitySpawnAdapter.GetState call made once for each per-instance
entity, and became redundant with the Issue #47 fix that trusts MeshRefs.
2. Front-to-back sort opaque groups. Squared distance from camera to
each group's first-instance translation; ascending sort. Lets the
GPU's depth test reject fragments behind closer geometry — real
win on dense scenes (Holtburg courtyard, Foundry interior).
3. Per-entity AABB frustum cull. 5m-radius AABB check per entity
before walking parts. Skips work for distant entities even when
their landblock is partially visible. Animated entities (other
characters, NPCs, monsters) bypass — they always need per-frame
work for animation regardless. Conservative radius covers typical
entity bounds; large outliers stay landblock-culled.
4. Memoize palette hash per entity. TextureCache.HashPaletteOverride
is now internal; new GetOrUploadWithPaletteOverride overload takes
a precomputed hash. The dispatcher computes it ONCE per entity and
reuses across every (part, batch) lookup, avoiding the per-batch
FNV-1a fold over SubPalettes. Trees / scenery without palette
overrides skip entirely (palHash stays 0).
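
Item 2 can be sketched as below, assuming System.Numerics matrices whose
Translation component holds the instance position. OpaqueGroupSketch and
DepthSortSketch are hypothetical names, and sorting empty groups last is
a choice made for this sketch only:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical stand-in for an opaque instance group.
public sealed class OpaqueGroupSketch
{
    public List<Matrix4x4> Instances { get; } = new List<Matrix4x4>();
}

public static class DepthSortSketch
{
    // Ascending squared distance from the camera to each group's first
    // instance, so nearer geometry is drawn first and the depth test can
    // reject fragments hidden behind it.
    public static void SortFrontToBack(
        List<OpaqueGroupSketch> groups, Vector3 cameraPos)
    {
        groups.Sort((a, b) =>
            Key(a, cameraPos).CompareTo(Key(b, cameraPos)));
    }

    private static float Key(OpaqueGroupSketch g, Vector3 cameraPos) =>
        g.Instances.Count == 0
            ? float.MaxValue
            : Vector3.DistanceSquared(g.Instances[0].Translation, cameraPos);
}
```

Squared distance preserves the ordering of true distance, so the sort
avoids a square root per group.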
Visual output unchanged; FPS up further, especially in dense scenes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three bugs surfaced and resolved during Task 26 visual verification.
1. **No-scenery + exploded characters**: WB's modern rendering path
(GL 4.3 + bindless) packs every mesh into a single global VAO/VBO/IBO
(GlobalMeshBuffer). Each batch references its slice via FirstIndex
(offset into IBO) + BaseVertex (offset into VBO). The dispatcher's
DrawElementsInstanced(indices=0) read offset 0 of the global IBO
for every entity — drawing the same first triangle from every
entity position. Switched to
glDrawElementsInstancedBaseVertexBaseInstance with the batch's
FirstIndex and BaseVertex offsets. Scenery + connected
characters now render correctly.
2. **Issue #47 character regression**: Adjustment 6 stored
AnimPartChanges on WorldEntity.PartOverrides using the raw
server-sent NewModelId (no degrade resolver applied). The
dispatcher's animState.ResolvePartGfxObj override path then
clobbered MeshRefs (which GameWindow's spawn code correctly
resolves to close-detail meshes via GfxObjDegradeResolver).
Result: humanoids drew low-detail (~14 verts/17 polys) base
meshes instead of close-detail (~32 verts/60 polys), losing
bicep / shoulder / back geometry. Fix: trust MeshRefs as the
source of truth and don't re-apply animState overrides at draw
time. AnimatedEntityState's overrides only matter for hot-swap
appearance updates (0xF625) which today rebuild MeshRefs anyway.
3. **Performance — sub-100 FPS on Holtburg**: per-entity
single-instance draws meant ~16K glDraw calls/frame plus a
64-byte glBufferSubData per call. Refactored to grouped
instanced rendering: bucket all (entity, batch) pairs by
GroupKey(Ibo, FirstIndex, BaseVertex, IndexCount, TextureHandle,
Translucency); upload all matrices in ONE BufferData call;
one glDrawElementsInstancedBaseVertexBaseInstance per group
with BaseInstance pointing at the group's slice in the shared
instance VBO. Down from ~16K to a few hundred draws/frame
(~30× fewer). Bind VAO once per frame (modern WB shares one
global VAO). Removed redundant per-draw VertexAttribPointer
(VAO captures that state).
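
A CPU-only sketch of the bucketing in item 3, with no GL calls. The
GroupKey fields follow the list above, but their types, the
InstanceGroupingSketch helper, and the first-seen group order are
assumptions made for illustration:

```csharp
using System.Collections.Generic;
using System.Numerics;

// Record struct gives value equality, so it can be a dictionary key.
public readonly record struct GroupKey(
    uint Ibo, uint FirstIndex, int BaseVertex,
    uint IndexCount, uint TextureHandle, int Translucency);

public static class InstanceGroupingSketch
{
    // Buckets (key, world matrix) pairs, then flattens the buckets into one
    // matrix array. Each draw records where its slice starts (BaseInstance)
    // and how many instances it covers.
    public static (Matrix4x4[] Matrices,
                   List<(GroupKey Key, int BaseInstance, int InstanceCount)> Draws)
        Build(IEnumerable<(GroupKey Key, Matrix4x4 World)> pairs)
    {
        var buckets = new Dictionary<GroupKey, List<Matrix4x4>>();
        var order = new List<GroupKey>(); // first-seen order, deterministic

        foreach (var (key, world) in pairs)
        {
            if (!buckets.TryGetValue(key, out var list))
            {
                list = new List<Matrix4x4>();
                buckets.Add(key, list);
                order.Add(key);
            }
            list.Add(world);
        }

        var matrices = new List<Matrix4x4>();
        var draws = new List<(GroupKey Key, int BaseInstance, int InstanceCount)>();
        foreach (var key in order)
        {
            var list = buckets[key];
            draws.Add((key, matrices.Count, list.Count));
            matrices.AddRange(list);
        }

        return (matrices.ToArray(), draws);
    }
}
```

The flattened array is what a single buffer upload would carry; one
instanced draw per entry in Draws then reads its slice via BaseInstance.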
Result: Holtburg renders correctly with characters showing full
detail; FPS climbed substantially. Two more bugs (mesh loading
+ batch.Key.SurfaceId) were fixed in the prior commit (943652d).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Task 26 visual verification surfaced several problems in the dispatcher
path. Three are fixed here; the remaining one (exploded characters,
missing scenery) is documented below as an open issue.
1. WB's IncrementRefCount only bumps a usage counter — it does NOT
trigger mesh loading. Fixed in WbMeshAdapter.IncrementRefCount:
call PrepareMeshDataAsync(id, isSetup: false) on first registration.
Result auto-enqueues to _stagedMeshData (line 510 of WB's
ObjectMeshManager) which Tick() drains onto the GPU.
2. EntitySpawnAdapter never registered per-instance entity meshes
with WB. LandblockSpawnAdapter only registers atlas-tier
(ServerGuid == 0); per-instance entities fell through. Fixed by
adding optional IWbMeshAdapter constructor param + tracking unique
GfxObj ids per server-guid for IncrementRefCount on OnCreate /
DecrementRefCount on OnRemove.
3. WbDrawDispatcher.ResolveTexture used batch.SurfaceId which WB
never populates (line 1746 of ObjectMeshManager only sets
batch.Key — the TextureKey struct that has SurfaceId). Switched
to batch.Key.SurfaceId.
Plus diagnostic counters (ACDREAM_WB_DIAG=1) for entity-seen / drawn
/ mesh-missing / draws-issued counts.
Status: with these fixes the dispatcher now issues real draw calls
(~16K/frame, validated via diagnostic). However visual verification
shows characters appear "exploded" (parts spaced too far apart) and
scenery (trees/rocks/fences/buildings) does not appear. Root cause
analysis pending — Adjustment 7 in the plan documents the deferred
work. Flag stays default-off; legacy renderer remains the
production path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WbDrawDispatcher draws all entities through WB's ObjectRenderData
(VAO/VBO per GfxObj, per-batch IBO) using acdream's TextureCache for
texture resolution. Two-pass rendering (opaque+ClipMap, then
translucent) matching the existing InstancedMeshRenderer pattern.
Per-entity single-instance drawing is used for N.4 simplicity; true
instanced grouping is deferred to N.6.
Atlas-tier entities: mesh from WB, texture from TextureCache via
batch SurfaceId. Per-instance-tier entities: AnimatedEntityState
drives part overrides + hidden-parts, palette/surface overrides
resolve through TextureCache's composite-key caches.
Side-table population (Task 23 folded in): WbMeshAdapter now takes
DatCollection and populates AcSurfaceMetadataTable on first
IncrementRefCount per GfxObj. The side-table provides TranslucencyKind
(critical for ClipMap alpha-test on vegetation) plus Luminosity,
Diffuse, SurfOpacity, NeedsUvRepeat, DisableFog for sky-pass and
lighting.
GameWindow wiring: when WbFoundationFlag is enabled, WbDrawDispatcher
draws everything and InstancedMeshRenderer is skipped. Flag-off path
is unchanged.
Matrix composition: restPose * animOverride * entityWorld, matching
the spec. Three MatrixCompositionTests verify the contract.
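
A sketch of the composition contract, assuming System.Numerics matrices
(row-vector convention, so the leftmost factor is applied to a vertex
first). PartTransformSketch is a hypothetical name; the real tests live
in MatrixCompositionTests:

```csharp
using System.Numerics;

public static class PartTransformSketch
{
    // With row vectors (v * M), restPose is applied first, then the
    // animation override, then the entity's world transform.
    public static Matrix4x4 Compose(
        Matrix4x4 restPose, Matrix4x4 animOverride, Matrix4x4 entityWorld)
        => restPose * animOverride * entityWorld;
}
```

Swapping the factor order would apply the world transform before the
part's rest pose, which is the kind of drift the tests are meant to catch.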
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>