acdream/docs/superpowers/specs/2026-05-08-phase-n4-rendering-foundation-design.md
Erik 9bb6b254dc spec(N.4): rendering pipeline foundation design
Adopting WB's ObjectMeshManager + TextureAtlasManager as acdream's
shared rendering infrastructure. Two-tier split: atlas for shared
procedural content (terrain props, scenery, buildings), per-instance
path for server-spawned customized entities (characters, creatures,
equipped items).

Animation handled by composing per-frame override matrices from our
existing AnimationSequencer with cached rest poses at draw time.
Cache stays valid; AnimationSequencer untouched.

Streaming-loader integration: ~200 LOC adapter shim wires landblock
load/unload to IncrementRefCount/DecrementRefCount; pending-spawn
list mechanism preserved.

Surface metadata (Translucency/Luminosity/Diffuse/SurfOpacity/
NeedsUvRepeat/DisableFog) preserved via side-table keyed by
(GfxObjId, surfaceIdx) — no fork patches required.

Three algorithmic conformance tests run before substitution per the
N.1/N.3 pattern. Visual verification at 5 named locations.

3-4 weeks, single shippable phase. Foundation enables N.5-N.9.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-08 12:47:49 +02:00

Phase N.4 — Rendering Pipeline Foundation: Design

Date: 2026-05-08
Status: Design complete, awaiting plan generation.
Parent design: 2026-05-08-phase-n-worldbuilder-migration-design.md
Roadmap entry: docs/plans/2026-04-11-roadmap.md (Phase N.4)
Inventory reference: docs/architecture/worldbuilder-inventory.md
Related: ISSUE #51, terrain split formula divergence (handled in N.5).

Goal

Adopt WB's Chorizite.OpenGLSDLBackend.Lib.ObjectMeshManager and TextureAtlasManager as acdream's rendering pipeline foundation. This is the integration that unblocks Phases N.5 (terrain), N.6 (static objects), N.7 (env cells), N.8 (sky/particles), and absorbs N.10 (GL infrastructure consolidation). N.4 ships no visible change — the world should look identical to today; what changes is the infrastructure behind the scenes.

Why

The roadmap's original "drop-in helper" framing was wrong for N.4. The 2026-05-08 brainstorm surfaced that WB's ObjectMeshManager is not a stateless helper class like SceneryHelpers (N.1) or TextureHelpers (N.3). It is a 2070-line stateful asset pipeline that owns:

  • GPU resources per object (VAO/VBO/IBO via ObjectRenderData)
  • Reference counting (IncrementRefCount/DecrementRefCount)
  • LRU cache + memory budget (default 1 GB)
  • Background-thread CPU mesh preparation, main-thread GPU upload
  • Shared texture atlases keyed by (Width, Height, Format)
  • Particle emitter staging
  • Modern bindless rendering path on capable hardware

There is no clean "just the mesh extraction" entry point. WB's BuildPolygonIndices (the algorithm we already faithfully ported into GfxObjMesh.cs) is a private method tightly coupled to atlas batching. Using WB's tested infrastructure at all means adopting the whole pipeline.

N.5 + N.6 + N.7 build on this foundation. WB's TerrainRenderManager, StaticObjectRenderManager, and EnvCellRenderManager all consume ObjectMeshManager (or its atlas) as substrate. Without N.4, each later phase would need to either fork those render managers or duplicate the infrastructure. Doing N.4 now means N.5/N.6/N.7 become integration phases on top of shared plumbing, not parallel infrastructure builds.

Real benefits beyond infrastructure consolidation:

  1. Memory budget with LRU eviction (we don't have this; bigger stream radii currently risk OOM).
  2. Texture atlasing → ~4-8× fewer draw calls for static scenery (~1100 entities at Holtburg today).
  3. Background-thread mesh preparation — addresses the render-thread-stall problem from feedback_phase_a1_hotfix_saga.md that forced us to revert async streaming.
  4. Bindless textures on capable hardware (free perf when GL 4.3 + GL_ARB_bindless_texture are available).

Architecture

Two-tier rendering split

acdream's content cleanly partitions into two categories that map onto two rendering paths:

| Tier | Content | Why this category | Path |
|---|---|---|---|
| Atlas (shared) | Terrain props, scenery (procedural: trees / rocks / bushes / fences from ~50 templates), buildings, slabs, dungeon static geometry | Client-side procedural; no per-instance variation; many instances of few unique meshes | WB's ObjectMeshManager + TextureAtlasManager. Big sharing wins (1100 entities ↦ ~50 atlas slots). |
| Per-instance (customized) | Server-spawned entities (CreateObject): characters, creatures, equipped items. Anything carrying SubPalettes / TextureChanges / AnimPartChange / HiddenParts / GfxObjRemapping | Always uniquely customized; few visible at a time (~10-50) | Existing TextureCache.GetOrUploadWithPaletteOverride. Already hash-keys overrides for caching; already tested. |

Routing rule:

  • Objects spawned by LandblockStreamLoader (procedural, no customization) → atlas tier.
  • Objects spawned by CreateObject (network, always customized) → per-instance tier.

The boundary mirrors a distinction that already exists in our networking model. We are not inventing a new conceptual line; we are matching one that's already there.
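
The routing rule above can be sketched as a small pure function. This is only an illustration: SpawnSource, RenderTier, and TierRouter are hypothetical names that do not exist in acdream; only the rule itself comes from this spec.

```csharp
using System;

// Usage: the tier depends only on where the spawn came from.
Console.WriteLine(TierRouter.Route(SpawnSource.LandblockStream));      // prints Atlas
Console.WriteLine(TierRouter.Route(SpawnSource.NetworkCreateObject));  // prints PerInstance

// Hypothetical: where a spawn originated.
enum SpawnSource { LandblockStream, NetworkCreateObject }

// Hypothetical: which rendering path handles the object.
enum RenderTier { Atlas, PerInstance }

static class TierRouter
{
    // Procedural landblock content shares atlas slots; network spawns are
    // always treated as customized and stay on the per-instance path.
    public static RenderTier Route(SpawnSource source) => source switch
    {
        SpawnSource.LandblockStream => RenderTier.Atlas,
        SpawnSource.NetworkCreateObject => RenderTier.PerInstance,
        _ => throw new ArgumentOutOfRangeException(nameof(source)),
    };
}
```

Keeping the decision keyed on the spawn source, rather than inspecting the object for customization fields, matches the claim that the boundary already exists in the networking model.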

Animation handling

Core insight: in AC, animation is per-part TRANSFORM changes, not mesh changes. A creature's Setup is a list of rigid GfxObj parts (head, body, hands, etc.). Each part is its own static mesh; vertices inside each part never change. Animation moves the parts as rigid bodies.

This means mesh data is static even for animated entities — the cache works fine. Only the per-part transforms change per frame, and those don't live in the mesh cache.

Composition at draw time:

final_part_world_matrix
    = entity_world_transform
    × animation_override          (from AnimationSequencer, this frame)
    × rest_pose_transform         (cached in ObjectMeshData.SetupParts)

  • WB's ObjectMeshData.SetupParts: List<(ulong GfxObjId, Matrix4x4 Transform)> stores the rest-pose transforms (cached, shared).
  • Our existing AnimationSequencer is untouched. It continues to produce per-part override matrices per frame, driven by motion table + current motion command + tick.
  • The renderer composes the three matrices per part per draw and pushes the result as a uniform/instance attribute.
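
A minimal sketch of the composition, assuming the Matrix4x4 in SetupParts is System.Numerics.Matrix4x4. The formula above is written in column-vector order (the rightmost matrix is applied first). System.Numerics uses the row-vector convention, so the same composition is written in the opposite order in code. PartMatrix.Compose and the example values are hypothetical.

```csharp
using System;
using System.Numerics;

// Example inputs (made-up values, for illustration only).
Matrix4x4 rest = Matrix4x4.CreateTranslation(0f, 1f, 0f);    // part offset within the Setup
Matrix4x4 anim = Matrix4x4.CreateRotationZ(MathF.PI / 2f);   // this frame's override
Matrix4x4 entity = Matrix4x4.CreateTranslation(10f, 0f, 0f); // entity placement in the world

Matrix4x4 world = PartMatrix.Compose(entity, anim, rest);

// The part's local origin moves to (0,1,0) by the rest pose, rotates 90 degrees
// about Z to (-1,0,0), then shifts by the entity transform to roughly (9,0,0).
Vector3 origin = Vector3.Transform(Vector3.Zero, world);
Console.WriteLine(origin);

static class PartMatrix
{
    // Row-vector convention: v * rest * anim * entity applies rest first,
    // which is the same transform as entity x anim x rest in column-vector notation.
    public static Matrix4x4 Compose(Matrix4x4 entityWorld, Matrix4x4 animationOverride, Matrix4x4 restPose)
        => restPose * animationOverride * entityWorld;
}
```

The result is only approximately (9, 0, 0) because of floating-point rounding in the rotation, so a conformance test on this composition would compare with a tolerance rather than exact equality.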

AnimPartChange (server swaps a part's GfxObj — e.g., wielding a sword): per-entity override map Dictionary<int partIndex, ulong gfxObjId>. At draw time, look up override; fall back to cached Setup part. WB's mesh manager caches the override GfxObj's mesh data the same way as any other part — first time seen, then shared.

HiddenParts (bitmask hiding parts): per-entity ulong bitmask. Draw loop: if ((hiddenMask & (1UL << partIndex)) != 0) continue;
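
The two per-entity lookups (AnimPartChange override, HiddenParts mask) can be sketched together as one resolution step. PartResolver and the ids below are hypothetical; the sketch assumes part ids are ulong, as in SetupParts.

```csharp
using System;
using System.Collections.Generic;

// Setup defaults and per-entity state (ids are made-up example values).
ulong[] defaultParts = { 0x01000001, 0x01000002, 0x01000003 };
var overrides = new Dictionary<int, ulong> { [2] = 0x010000AA }; // AnimPartChange: part 2 swapped
ulong hiddenMask = 1UL << 1;                                      // HiddenParts: part 1 hidden

for (int i = 0; i < defaultParts.Length; i++)
{
    if (PartResolver.TryResolve(i, defaultParts, overrides, hiddenMask, out ulong id))
        Console.WriteLine($"part {i}: draw 0x{id:X8}");
    else
        Console.WriteLine($"part {i}: hidden");
}

static class PartResolver
{
    // Returns false when the part is hidden; otherwise yields the override
    // GfxObj id if one is present, falling back to the Setup's default part.
    public static bool TryResolve(int partIndex, IReadOnlyList<ulong> defaultParts,
        IReadOnlyDictionary<int, ulong> overrides, ulong hiddenMask, out ulong gfxObjId)
    {
        gfxObjId = 0;
        // 1UL keeps the shift 64-bit; a plain `1 << partIndex` is a 32-bit int shift.
        if ((hiddenMask & (1UL << partIndex)) != 0) return false;
        gfxObjId = overrides.TryGetValue(partIndex, out ulong swapped) ? swapped : defaultParts[partIndex];
        return true;
    }
}
```

With the example state, part 0 draws its default id, part 1 is skipped, and part 2 draws the override id.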

Per-frame CPU cost: ~50 visible animated entities × ~20 parts = ~1000 part-matrix compositions per frame (two 4×4 multiplies each). Sub-millisecond on any CPU.

GPU-side per-draw transform push: start with uniform-per-draw (simple, ~1000 draws/frame for animated entities — fine). Promote to per-instance vertex attribute (instanced draw, ~50 draws/frame) only if measured perf demands it.

Streaming loader integration

Adapter shim, ~200 LOC, sits between LandblockStreamLoader / WorldSession and ObjectMeshManager:

| Source event | Adapter call | What ObjectMeshManager does |
|---|---|---|
| Landblock loaded by streaming | IncrementRefCount(id) per unique GfxObj/Setup id in Setups[] + Statics[] | Begins CPU prep on background worker if not cached; queues GPU upload on main thread |
| Landblock unloaded by streaming (radius hysteresis) | DecrementRefCount(id) per object | Drops to LRU when count reaches 0; LRU + 1 GB memory budget handles eviction |
| Network CreateObject | Per-instance path: build PaletteOverride from SubPalettes, decode through TextureCache.GetOrUploadWithPaletteOverride, register entity-local mesh data | Bypasses WB atlas; stays in our existing per-instance path |
| Network RemoveObject | Release per-instance state for entity | (no WB call) |
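
The landblock rows of the table can be sketched as follows. This is a sketch under stated assumptions: IMeshRefCounter is a hypothetical stand-in for the two ObjectMeshManager calls (their real signatures are not confirmed here), RecordingRefCounter is a test double, and the ids are made-up values.

```csharp
using System;
using System.Collections.Generic;

var counter = new RecordingRefCounter();
// Two landblocks that share id 0x02000001; the first lists it twice.
LandblockRefs.OnLoaded(counter, setups: new ulong[] { 0x02000001, 0x02000001 }, statics: new ulong[] { 0x01000010 });
LandblockRefs.OnLoaded(counter, setups: new ulong[] { 0x02000001 }, statics: Array.Empty<ulong>());
Console.WriteLine(counter.Counts[0x02000001]); // prints 2: one reference per landblock

LandblockRefs.OnUnloaded(counter, setups: new ulong[] { 0x02000001, 0x02000001 }, statics: new ulong[] { 0x01000010 });
Console.WriteLine(counter.Counts[0x02000001]); // prints 1: the second landblock still holds it

// Hypothetical stand-in for the ObjectMeshManager calls the adapter uses.
interface IMeshRefCounter
{
    void IncrementRefCount(ulong id);
    void DecrementRefCount(ulong id);
}

// Test double that records the net count per id.
sealed class RecordingRefCounter : IMeshRefCounter
{
    public Dictionary<ulong, int> Counts { get; } = new();
    public void IncrementRefCount(ulong id) => Counts[id] = Counts.GetValueOrDefault(id) + 1;
    public void DecrementRefCount(ulong id) => Counts[id] = Counts.GetValueOrDefault(id) - 1;
}

static class LandblockRefs
{
    // Dedupe within a landblock so load and unload always pair one-to-one.
    static HashSet<ulong> UniqueIds(IEnumerable<ulong> setups, IEnumerable<ulong> statics)
    {
        var ids = new HashSet<ulong>(setups);
        ids.UnionWith(statics);
        return ids;
    }

    public static void OnLoaded(IMeshRefCounter meshes, IEnumerable<ulong> setups, IEnumerable<ulong> statics)
    {
        foreach (ulong id in UniqueIds(setups, statics)) meshes.IncrementRefCount(id);
    }

    public static void OnUnloaded(IMeshRefCounter meshes, IEnumerable<ulong> setups, IEnumerable<ulong> statics)
    {
        foreach (ulong id in UniqueIds(setups, statics)) meshes.DecrementRefCount(id);
    }
}
```

Using the same dedupe on both load and unload is what keeps the counts balanced; counting per occurrence on one side and per unique id on the other would leak or underflow references.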

Pending-spawn list preservation: the streaming loader's existing pending-spawn list mechanism stays in place. CreateObject arriving before its landblock streams in still parks until the landblock arrives, then drains. The adapter is invoked when the spawn drains, not when it parks.
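
The park-then-drain behavior described above can be illustrated as follows. This is not the existing implementation, which this spec leaves untouched; PendingSpawns and its members are hypothetical, and landblock unload handling is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;

var spawns = new PendingSpawns<string>(onDrain: name => Console.WriteLine($"adapter invoked for {name}"));
spawns.OnCreateObject(0xA9B4FFFF, "drudge");   // parks: landblock not loaded yet, nothing printed
spawns.OnLandblockLoaded(0xA9B4FFFF);          // drains: "adapter invoked for drudge"
spawns.OnCreateObject(0xA9B4FFFF, "chicken");  // landblock present: handled immediately

sealed class PendingSpawns<TSpawn>
{
    private readonly Dictionary<uint, List<TSpawn>> _parked = new();
    private readonly HashSet<uint> _loaded = new();
    private readonly Action<TSpawn> _onDrain;

    public PendingSpawns(Action<TSpawn> onDrain) => _onDrain = onDrain;

    public int ParkedCount(uint landblockId) =>
        _parked.TryGetValue(landblockId, out var list) ? list.Count : 0;

    public void OnCreateObject(uint landblockId, TSpawn spawn)
    {
        // The adapter runs only once the spawn's landblock is available.
        if (_loaded.Contains(landblockId)) { _onDrain(spawn); return; }
        if (!_parked.TryGetValue(landblockId, out var list))
            _parked[landblockId] = list = new List<TSpawn>();
        list.Add(spawn);
    }

    public void OnLandblockLoaded(uint landblockId)
    {
        _loaded.Add(landblockId);
        if (!_parked.Remove(landblockId, out var list)) return;
        foreach (TSpawn spawn in list) _onDrain(spawn);
    }
}
```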

Thread safety: WB's ObjectMeshManager uses ConcurrentDictionary for its internal state and is designed to take IncrementRefCount calls from any thread. Our streaming worker can call it directly without marshaling onto the render thread. (This is part of why WB's design addresses the render-thread-stall problem.)
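
To illustrate why a ConcurrentDictionary makes cross-thread reference counting safe, here is a minimal sketch. It is not WB's implementation; ConcurrentRefCounts is a hypothetical class, and the loop counts are arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var refs = new ConcurrentRefCounts();

// Simulate 8 workers that each take 1000 references to the same id.
Parallel.For(0, 8, _ => { for (int i = 0; i < 1000; i++) refs.Increment(0x01000001); });
Console.WriteLine(refs.Get(0x01000001)); // prints 8000

// Release them all again, also concurrently.
Parallel.For(0, 8, _ => { for (int i = 0; i < 1000; i++) refs.Decrement(0x01000001); });
Console.WriteLine(refs.Get(0x01000001)); // prints 0

sealed class ConcurrentRefCounts
{
    private readonly ConcurrentDictionary<ulong, int> _counts = new();

    // AddOrUpdate retries internally on contention, so no increment is lost.
    public int Increment(ulong id) => _counts.AddOrUpdate(id, 1, (_, n) => n + 1);

    // Floors at zero so an unmatched release cannot drive the count negative.
    public int Decrement(ulong id) => _counts.AddOrUpdate(id, 0, (_, n) => Math.Max(0, n - 1));

    public int Get(ulong id) => _counts.TryGetValue(id, out int n) ? n : 0;
}
```

The update delegate passed to AddOrUpdate may run more than once under contention, so it must stay free of side effects, as it is here.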

Surface metadata strategy

Side-table, not fork patch.

WB's MeshBatchData carries IsTransparent + IsAdditive. We need to preserve these acdream-specific surface properties already present in our GfxObjMesh.cs:

  • Translucency (TranslucencyKind enum: Opaque / AlphaBlend / Additive)
  • Luminosity (float, self-illumination coefficient — sky pass critical)
  • Diffuse (float)
  • SurfOpacity (float, derived from Surface.Translucency)
  • NeedsUvRepeat (bool, derived from authored UV range — sky-pass wrap-mode selection)
  • DisableFog (bool, derived from emissive surface flags — sky-pass fog skip)

Our renderer integration maintains a side-table: Dictionary<(ulong gfxObjId, int surfaceIdx), AcSurfaceMetadata>. The key matches the shape of today's GfxObjSubMesh — a (GfxObj, surface index) pair uniquely identifies a per-surface render batch. Stable across IncrementRefCount cycles. The metadata is computed once at mesh-extraction time (matching today's GfxObjMesh.Build) and looked up at draw time.
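
A minimal sketch of the side-table, using the type names and fields listed in this spec. The member names Add and TryLookup, and the example values, are assumptions made for illustration; the real table may expose a different surface.

```csharp
using System;
using System.Collections.Generic;

var table = new AcSurfaceMetadataTable();

// Populated once at mesh-extraction time (example values are made up).
table.Add(gfxObjId: 0x01000001, surfaceIdx: 0,
    new AcSurfaceMetadata(TranslucencyKind.Additive, Luminosity: 1f, Diffuse: 0f,
        SurfOpacity: 1f, NeedsUvRepeat: true, DisableFog: true));

// Queried per batch at draw time.
if (table.TryLookup(0x01000001, 0, out AcSurfaceMetadata meta))
    Console.WriteLine($"{meta.Translucency}, fog disabled: {meta.DisableFog}"); // prints Additive, fog disabled: True

enum TranslucencyKind { Opaque, AlphaBlend, Additive }

// Immutable value type, so entries can be copied freely at draw time.
readonly record struct AcSurfaceMetadata(
    TranslucencyKind Translucency, float Luminosity, float Diffuse,
    float SurfOpacity, bool NeedsUvRepeat, bool DisableFog);

sealed class AcSurfaceMetadataTable
{
    // Key shape mirrors a per-surface render batch: (GfxObj id, surface index).
    private readonly Dictionary<(ulong GfxObjId, int SurfaceIdx), AcSurfaceMetadata> _byBatch = new();

    public void Add(ulong gfxObjId, int surfaceIdx, AcSurfaceMetadata meta)
        => _byBatch[(gfxObjId, surfaceIdx)] = meta;

    public bool TryLookup(ulong gfxObjId, int surfaceIdx, out AcSurfaceMetadata meta)
        => _byBatch.TryGetValue((gfxObjId, surfaceIdx), out meta);
}
```

Because the key is a value tuple of the GfxObj id and surface index, it does not depend on any GPU handle and survives evict-and-reload cycles unchanged.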

Why side-table not fork patch:

  • Keeps WB's types pristine; upstream merges stay clean.
  • Lookup cost is negligible (one hash lookup per batch per frame).
  • Easy to roll back if WB's design evolves to incorporate similar fields.
  • Preserves the careful sky-pass work done in C.1 with no risk to sky rendering during this migration.

Fork hygiene

Target: zero fork patches for N.4. WB's acdream branch stays at upstream master plus the editor-only file deletions inherited from N.0/N.1. If a fork patch becomes genuinely necessary mid-implementation (e.g., a public hook is missing for our customization layer), it lands as a single named patch with a comment explaining the rationale. Each patch is a candidate for upstreaming back to Chorizite/WorldBuilder.

Components

New code (acdream-side)

| File | Responsibility |
|---|---|
| src/AcDream.App/Rendering/Wb/WbMeshAdapter.cs | Bridges acdream's lifecycle events to ObjectMeshManager. Holds the ObjectMeshManager instance, exposes IncrementRefCount / DecrementRefCount / GetRenderData to the rest of the renderer. |
| src/AcDream.App/Rendering/Wb/LandblockSpawnAdapter.cs | Streaming-loader hook. Walks LandblockEntry.Setups[] + Statics[], calls WbMeshAdapter with unique ids. Companion LandblockUnloadAdapter for unload events. |
| src/AcDream.App/Rendering/Wb/EntitySpawnAdapter.cs | Network-spawn hook. Routes CreateObject to per-instance path, RemoveObject to release. |
| src/AcDream.App/Rendering/Wb/AcSurfaceMetadata.cs | Side-table type holding Translucency / Luminosity / Diffuse / SurfOpacity / NeedsUvRepeat / DisableFog. |
| src/AcDream.App/Rendering/Wb/AcSurfaceMetadataTable.cs | The Dictionary<batchKey, AcSurfaceMetadata> side-table, populated at mesh-extraction time, queried at draw time. |
| src/AcDream.App/Rendering/Wb/AnimatedEntityState.cs | Per-entity render state for animated entities: partGfxObjOverrides map (AnimPartChange), hiddenMask (HiddenParts), reference to AnimationSequencer for per-frame override matrices. |
| src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs | Per-frame draw loop. For each visible entity, looks up ObjectRenderData, composes per-part matrices (entity × animation × rest-pose), reads side-table metadata, issues GL draw. |

Modified code (acdream-side)

| File | Change |
|---|---|
| src/AcDream.App/Rendering/StaticMeshRenderer.cs | Replace internal mesh-data + GL-resource handling with calls into WbMeshAdapter. Public surface preserved for the rest of the renderer's call sites. N.6 will fully replace this file; N.4 leaves it in place as a thin adapter. |
| src/AcDream.App/Rendering/InstancedMeshRenderer.cs | Same pattern: internal swap, public surface preserved. N.6 fully replaces this file. |
| src/AcDream.App/Rendering/TextureCache.cs | Per-instance path stays. Atlas-tier callers (anything using GetOrUpload(surfaceId) for static content) route through WbMeshAdapter instead. The override paths (GetOrUploadWithOrigTextureOverride, GetOrUploadWithPaletteOverride) keep their current behavior. |
| src/AcDream.App/Rendering/GpuWorldState.cs | Spawn/despawn callbacks route through WbMeshAdapter. Pending-spawn list mechanism preserved verbatim. |
| src/AcDream.App/Rendering/GameWindow.cs | Construct WbMeshAdapter on init; dispose on shutdown. |
| src/AcDream.Core/Meshing/SetupMesh.cs | Kept for tests + as the conformance-test reference implementation. Production callers route through WbMeshAdapter. |
| src/AcDream.Core/Meshing/GfxObjMesh.cs | Kept for tests + conformance reference. Production callers route through WbMeshAdapter. |

Data flow

Spawn — landblock-streamed (atlas tier)

LandblockStreamLoader.Load(landblockId)
  → LandblockEntry { Setups, Statics, ... }
  → LandblockSpawnAdapter.OnLoaded(entry)
    for each unique gfxObjId in (entry.Setups ∪ entry.Statics):
      WbMeshAdapter.IncrementRefCount(gfxObjId)
        → ObjectMeshManager.IncrementRefCount(gfxObjId)
          → if not cached: queue background prep
          → on prep complete: queue main-thread upload
          → on upload: GL VAO/VBO/IBO ready

Spawn — network-customized (per-instance tier)

WorldSession.OnCreateObject(msg)
  → EntitySpawnAdapter.OnCreate(entity)
    → build PaletteOverride from msg.SubPalettes
    → for each surface needing per-instance decode:
        TextureCache.GetOrUploadWithPaletteOverride(...)
    → register AnimatedEntityState (override map, hidden mask,
      animation sequencer reference)

Per-frame draw (atlas tier)

WbDrawDispatcher.Draw()
  for each visible atlas-tier entity:
    var renderData = WbMeshAdapter.GetRenderData(entity.GfxObjId)
    foreach (batch in renderData.Batches):
      bind atlas, bind shader, push uniforms
      foreach (part in renderData.SetupParts):
        push final_part_world_matrix uniform
        glDrawElements(part.indices)

Per-frame draw (per-instance tier, animated)

WbDrawDispatcher.DrawAnimated()
  for each visible animated entity:
    var state = entity.AnimatedEntityState
    var sequencer = entity.AnimationSequencer
    sequencer.AdvanceTo(currentTime)  // existing
    var animOverrides = sequencer.GetCurrentPartTransforms()  // existing

    foreach (partIdx in 0..parts.Count):
      if ((state.hiddenMask & (1UL << partIdx)) != 0) continue;
      var gfxObjId = state.partGfxObjOverrides.TryGetValue(partIdx, out var swapped) ? swapped : defaultParts[partIdx]
      var renderData = WbMeshAdapter.GetRenderData(gfxObjId)
      var meta = AcSurfaceMetadataTable.Lookup(renderData.BatchKey)
      var worldMatrix = entityWorld × animOverrides[partIdx] × renderData.RestPose
      bind per-instance texture (TextureCache lookup)
      push uniforms (worldMatrix, meta.Luminosity, meta.Diffuse, ...)
      glDrawElements(...)

Testing

Algorithmic conformance (before substitution)

Per the N.1 / N.3 pattern, conformance tests run BEFORE the substitution to prove equivalence:

| Test | Compares |
|---|---|
| MeshExtraction_OurBuildVsWbBuildPolygonIndices | Battery of fixture GfxObjs (varying polygon counts, stippling flags, NegUVIndices, double-sided polys). For each: our GfxObjMesh.Build output vs WB's ObjectMeshManager output (extracted via test harness). Assert: identical vertex arrays, identical index arrays, identical per-bucket surface mapping. |
| SetupFlattening_OurFlattenVsWbSetupParts | Battery of representative Setups (flat / hierarchical / Resting-frame / Default-frame / no-frame). For each: our SetupMesh.Flatten output vs WB's Setup-parts walk. Assert: identical (GfxObjId, Matrix4x4) sequences. |
| PerInstanceDecode_OldVsNewPath | Synthetic palette + texture overrides (mirroring real CreateObject data). Decoded through new integrated path vs current TextureCache.GetOrUploadWithPaletteOverride. Assert: identical RGBA8. |

If any test fails it's a real divergence — investigate, do not "fix" the test (per N.3 watchout).

Component micro-tests

| Test | Covers |
|---|---|
| LandblockSpawnAdapter_RegistersAndUnregisters | Mock ObjectMeshManager; verify ref-count increments/decrements pair correctly across landblock load/unload events. |
| LandblockSpawnAdapter_DedupesSharedIds | Same GfxObj id appearing in multiple landblocks: verify single ref-count per landblock, not per occurrence. |
| EntitySpawnAdapter_RoutesToPerInstance | CreateObject with SubPalettes set: verify per-instance path taken, atlas tier not invoked. |
| AnimPartChange_OverridesAtDraw | Per-instance override map: verify draw loop resolves correct part GfxObj id when override present, falls back to Setup default when absent. |
| HiddenParts_SuppressesDraw | Bitmask: verify draw loop skips hidden parts. |
| MatrixComposition_EntityAnimRest | Known entity transform + animation matrix + rest pose: verify final world matrix matches expected composition order (column-vector convention: rest applied first, then animation, then entity world). |
| SurfaceMetadata_SideTableLookup | Populate side-table during mesh extraction; query at draw time; verify Luminosity / Diffuse / DisableFog round-trip correctly. |

Visual verification (per phase, before flipping Live ✓)

Walk the following with the user, comparing against pre-N.4 screenshots or video:

  1. Holtburg outdoor — terrain props, scenery, buildings, NPCs, characters. Verify: no missing entities, no magenta squares, no alpha bleeding, no shading regressions, no animation hitches.
  2. Drudge Hideout (or comparable starter dungeon) — EnvCell geometry, interior lighting, animated creatures.
  3. Foundry — heavy NPC traffic, customized appearances (the server's first-time test bed for per-instance customization correctness).
  4. A character with extreme palette overrides — char-creation variant if available, otherwise a known-customized server-side test character.
  5. Long roam — walk for ~5 minutes across multiple landblocks, monitor GPU memory in title bar (memory budget enforcement working means it stabilizes; memory growing unboundedly means LRU eviction isn't firing).

Phasing

Single shippable phase, with no internal sub-phases. Within the phase, work is ordered to minimize time spent in a "broken in the middle" state:

| Week | Focus | "Done when" |
|---|---|---|
| 1 | WB integration plumbing + atlas bring-up for static scenery only (smallest tier, highest sharing factor) + algorithmic conformance tests pass | Conformance tests green; static scenery renders through ObjectMeshManager while everything else uses old path |
| 2 | Streaming-loader adapter; LRU + memory budget verified under streaming pressure (long roam + radius 7×7) | Long roam holds steady GPU memory; landblock unload reclaims memory |
| 3 | Per-instance customization path; animated creatures with palette overrides; AnimPartChange + HiddenParts | Drudge / chicken / banderling render with correct customizations; animation matches today |
| 4 | Surface metadata side-table integration; sky-pass preservation; visual verification at named locations; polish | Visual verification at all 5 locations passes; sky pass renders identically; ready for Live ✓ |

Risks

  1. Per-instance customization scope creep. If we discover a customization path we don't already handle in TextureCache (e.g., a rare GfxObjRemapping case), the per-instance path may need extension. Mitigation: enumerate all customization paths during week 3, add tests for each before integrating.

  2. WB threading model interaction with our streaming worker. ObjectMeshManager uses ConcurrentDictionary and is designed for concurrent IncrementRefCount calls, but its _pendingRequests queue is guarded by a lock. Heavy concurrent landblock loads could serialize on this lock. Mitigation: profile during week 2; if contention is visible, batch landblock loads to amortize the lock.

  3. Sky pass regression. The sky pass's NeedsUvRepeat / DisableFog / Luminosity flow is fragile and load-bearing. The side-table preserves the data, but the integration point with SkyRenderer needs careful review. Mitigation: sky-pass-specific visual verification before flipping Live ✓.

  4. Bindless rendering path mismatch. WB enables bindless when GL 4.3 + GL_ARB_bindless_texture are present. If we ship through the bindless path and a player has older hardware, fallback path must work. Mitigation: dev/test with _useModernRendering = false forced during week 1 to ensure the non-bindless path is also exercised.

  5. Performance regression during integration of week 1's "atlas for static scenery, old path for everything else" mixed state. Mitigation: keep the feature gate ACDREAM_USE_WB_FOUNDATION=1 during weeks 1-3; default-off until week 4 visual verification.
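
The feature gate in risk 5 could be read along these lines. Only the variable name ACDREAM_USE_WB_FOUNDATION and the value 1 come from this spec; FeatureGates and its members are hypothetical, and treating every other value as "off" is an assumption.

```csharp
using System;

Console.WriteLine(FeatureGates.IsEnabled("1"));    // prints True
Console.WriteLine(FeatureGates.IsEnabled(null));   // prints False (unset means default-off)
Console.WriteLine(FeatureGates.UseWbFoundation()); // depends on the process environment

static class FeatureGates
{
    public const string WbFoundationVar = "ACDREAM_USE_WB_FOUNDATION";

    // Assumption: only the exact value "1" turns the new path on.
    public static bool IsEnabled(string? rawValue) => rawValue == "1";

    // Read once at startup (for example when constructing WbMeshAdapter) so the
    // choice of rendering path cannot change mid-session.
    public static bool UseWbFoundation()
        => IsEnabled(Environment.GetEnvironmentVariable(WbFoundationVar));
}
```

Splitting the parse from the environment read keeps the gate testable without mutating process-wide state.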

Out of scope

  • Replacing StaticMeshRenderer / InstancedMeshRenderer — those become thin adapters in N.4 and are fully replaced in N.6.
  • Replacing TerrainAtlas / TerrainBlending — that's N.5.
  • Replacing EnvCell rendering — that's N.7.
  • Replacing sky / particle rendering — that's N.8.
  • Replacing visibility / culling — that's N.9.
  • Per-instance customization beyond what's in today's TextureCache (e.g., novel customization opcodes from future Phase F work) — out of scope; future opcodes route through the same per-instance path.

Documentation impact

  • Roadmap — N.4 entry rebranded and N.5/N.6/N.7/N.8/N.9/N.10 estimates revised (committed 6d42744 and merged to main).
  • This spec — written 2026-05-08, committing alongside.
  • worldbuilder-inventory.md — minor update at end of N.4 to mark ObjectMeshManager / TextureAtlasManager as "now wired up" rather than just "should use." Not blocking N.4 start.
  • acdream-architecture.md — needs an acknowledging note after N.4 lands that the rendering pipeline is WB-backed. Can follow in a later commit.

Reference materials

  • WB ObjectMeshManager: references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/ObjectMeshManager.cs
  • WB TextureAtlasManager: references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TextureAtlasManager.cs
  • WB BaseObjectRenderManager: references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/BaseObjectRenderManager.cs
  • ACME secondary oracle for character appearance (CreaturePalette / GfxObjRemapping / HiddenParts behavior): references/WorldBuilder-ACME-Edition/WorldBuilder/Editors/Landscape/StaticObjectManager.cs
  • Existing acdream code: