acdream/docs/research/2026-06-11-building-render-holistic-port-handoff.md
Erik 9c45144047 docs: holistic building-render port charter + next-session prompt (the 2026-06-11 mandate)
User mandate: stop bug-by-bug; map acdream-vs-retail for building draw,
interiors, interior collision, dynamics, clipping, culling; plan the port
of retail's drawing discipline once and for all. The handoff carries the
branch state (124c6cb, nothing on main), the full evidence inventory from
this session (orphan no-draw polys, door-vanish mystery, draw-side clip
status, straddle gate), the gap map, tooling (Ghidra MCP 8081 correct
PDB, live cdb protocol, dat dump + flood harnesses), the investigation
charter (workflow fan-out per subsystem, adversarial verification), and
the paste-ready new-session prompt. #113 marked REOPENED and folded in.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 21:50:23 +02:00

HANDOFF — Holistic building/interior render port (the end of whack-a-mole)

Date: 2026-06-11. Branch: claude/thirsty-goldberg-51bb9b (worktree), HEAD 124c6cb. Nothing is on main. All session work is branch-only, per the user's explicit instruction.

0. The mandate (user, 2026-06-11, verbatim intent)

"We can't go on like this. We need to solve this holistic once and for all. Not cottage by cottage or bug by bug. Check our code vs how retail draws buildings, interior, collision interior, dynamic objects, clipping and culling. I want one solution that works every time I walk to a new landblock and walk into a dungeon. […] map acdream's way vs retail, then make a plan how to port retail's way of doing it once and for all. We have come a long way and our code is worth saving."

This supersedes the per-issue grind (#105→#113 and the §4 residual family). The next session is an investigation + plan session: NO production code until the plan is approved. The deliverables are (1) an acdream-vs-retail architecture comparison document and (2) a phased port plan. Code is worth saving: this is a port-the-missing-architecture plan, not a rewrite.

1. Branch state (what's in, what's off, how to verify)

| Commit | What | State |
|---|---|---|
| 927fd8f | #113 fix 1: GL clip distances enabled for the PView shell pass | IN (scoped by 9ce335e) |
| 414c3de | #112 rider: retail straddle gate for outdoor-cell admission (membership pick) | IN (solid, live-binary verified, conformance-pinned) |
| 8259598 | docs: ISSUES updates | IN |
| 9ce335e | #114 scope: shell clip outdoor-eye roots only (indoor regions not draw-quality) | IN |
| 6c9bbce | docs: #114 + #115 filed | IN |
| e46d3d9 | #113 fix 2: DrawingBSP poly filter in GfxObj mesh extraction | UN-APPLIED by 124c6cb (door regression); helper + tests remain |
| 124c6cb | revert of the filter application; keeps CollectDrawingBspPolygonIds + dat pins | IN |

Visible state when launched: doors work; the phantom staircase on the Holtburg meeting hall is VISIBLE again (known, documented — it is the un-filtered no-draw geometry, see §2.3); indoor rendering is the pre-#113 state (unclipped); outdoor interior-cell rendering is clipped to door apertures (the only part of the clip work that was validated).

Test baseline: Core 1392 green + 4 pre-existing #99-era failures (DoorBugTrajectoryReplay ×2, DoorCollisionApparatus, BSPStepUp) + 1 skip; App 226; UI 420; Net 294. Gates: P1 membership goldens, CornerFloodReplayTests, Issue107SpawnDiagnosticTests, Issue112MembershipTests (one renamed: ...DemotesRetailFaithfully; two new straddle-gate pins), Issue113 dump + filter pins.

2. What this session PROVED (evidence inventory — all reusable)

2.1 The retail draw-side portal clip exists and we half-have it

  • Retail clips drawn cell geometry to the accumulated portal view: Render::set_view (pc:343750) installs the view polygon's edge planes; DrawEnvCell submits every cell polygon with planeMask=0xffffffff (pc:427922) through ACRender::polyClipFinish. Characters/meshes are NOT poly-clipped (viewcone-check path, Render::viewconeCheck + BoundingType handling in DrawMeshInternal).
  • Our equivalent (UseShellClipRouting → mesh_modern.vert gl_ClipDistance, region SSBO binding=2 + per-instance slot binding=3) was routed but inert since birth (1405dd8): GL_CLIP_DISTANCEi was never enabled for the shell pass. 927fd8f enabled it; the outdoor case is validated (Issue113MeetingHallFloodTests: per-cell regions are tight 4-to-6-plane door-aperture boxes across a 21-step eye sweep, 0 unclipped fallbacks).
  • Indoor clip regions are admission-quality, NOT draw-quality (#114): enabling the clip indoors chopped real geometry (hall interior stairs, candle-holder area, walls vanishing at exits; user screenshots in the session transcript 2026-06-11). Suspects: knife-edge regions when the eye nears a portal plane; MergeBuildingFrame first-view-wins dropping additional apertures for multi-portal cells; the >8-plane slot-0 fallback drawing pass-all (the assembler's scissor fallback contract was never implemented — ClipFrameAssembler.cs:13-15).
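The clip-distance mechanism and the >8-plane pass-all fallback described above can be illustrated with a small sketch. This is Python for readability, not the project's C#/GLSL; the plane layout `(nx, ny, nz, d)`, the helper names, and the toy door-aperture box are assumptions made for illustration, not code from acdream or retail.

```python
from typing import List, Sequence, Tuple

# Assumed plane layout: (nx, ny, nz, d); a point p is inside when n.p + d >= 0.
Plane = Tuple[float, float, float, float]
Vec3 = Tuple[float, float, float]

MAX_CLIP_PLANES = 8  # mirrors the ">8-plane" limit mentioned in the text


def clip_distances(point: Vec3, planes: Sequence[Plane]) -> List[float]:
    """Signed distance of the point to each plane (what a shader would
    write per plane as a clip distance); negative means outside."""
    x, y, z = point
    return [nx * x + ny * y + nz * z + d for (nx, ny, nz, d) in planes]


def survives_clip(point: Vec3, planes: Sequence[Plane]) -> bool:
    """True when the point is kept by the clip region."""
    if len(planes) > MAX_CLIP_PLANES:
        # Pass-all fallback: nothing is clipped, so geometry outside the
        # region leaks through. This models the suspect named above.
        return True
    return all(dist >= 0.0 for dist in clip_distances(point, planes))


# Toy door aperture: x in [-1, 1], z in [0, 2], open along y.
DOOR_BOX: List[Plane] = [
    (1.0, 0.0, 0.0, 1.0),   # x >= -1
    (-1.0, 0.0, 0.0, 1.0),  # x <= 1
    (0.0, 0.0, 1.0, 0.0),   # z >= 0
    (0.0, 0.0, -1.0, 2.0),  # z <= 2
]
```

With the four-plane box, a point at `(0, 5, 1)` survives and a point at `(3, 5, 1)` is clipped. Padding the same region to nine planes makes `survives_clip` return `True` for both, which is the leak a real scissor fallback would be expected to prevent.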

2.2 The membership outdoor-admission gate is now retail-exact

414c3de (KEEP THIS — it is correct and orthogonal to the render mess): CEnvCell::find_transit_cells (live binary 0052c820, x87 decoded at 0052c8e5-0052c92d; Ghidra decompile confirms char-for-char) admits outdoor cells IFF a path sphere STRADDLES an exterior portal plane (|dist| < radius + F_EPSILON(0.0002)). The membership PICK now gates on it; the collision cell SET keeps the A6.P5 topology widening until A6.P4. Conformance-pinned on real dat geometry.
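As a minimal sketch of the straddle predicate quoted above (|dist| < radius + F_EPSILON), written in Python rather than the project's C#: the function names and the plane layout are hypothetical, the plane normal is assumed to be unit length, and only the inequality and the 0.0002 epsilon come from the text.

```python
from typing import Tuple

F_EPSILON = 0.0002  # value quoted in the text above

# Assumed plane layout: unit normal (nx, ny, nz) and offset d,
# with signed distance = n.p + d.
Plane = Tuple[float, float, float, float]
Vec3 = Tuple[float, float, float]


def signed_distance(plane: Plane, point: Vec3) -> float:
    """Signed distance from the point to the plane (unit normal assumed)."""
    nx, ny, nz, d = plane
    x, y, z = point
    return nx * x + ny * y + nz * z + d


def sphere_straddles_plane(center: Vec3, radius: float, plane: Plane) -> bool:
    """True when the sphere crosses the plane: |dist| < radius + F_EPSILON."""
    return abs(signed_distance(plane, center)) < radius + F_EPSILON


def admits_outdoor(center: Vec3, radius: float, exterior_portal_planes) -> bool:
    """Outdoor admission iff the sphere straddles at least one exterior
    portal plane (hypothetical wrapper around the predicate)."""
    return any(sphere_straddles_plane(center, radius, p)
               for p in exterior_portal_planes)
```

For a portal plane `y = 0` and a sphere of radius 0.48, a center 0.3 m on either side straddles the plane, while a center 1.0 m away does not, so only the first two positions would admit outdoor cells under this sketch.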

2.3 The REAL phantom staircase: no-draw (drawing-BSP-orphaned) polygons

Dat-proven (Issue113PhantomStairsDumpTests.DumpHallModel_PolyFlagHistogram):

  • Retail renders a GfxObj by traversing its DrawingBSP; polygons present in the Polygons dictionary but referenced by no DrawingBSP node are never drawn.
  • Holtburg meeting hall 0x010014C3: dictionary polys {0,1} are a ~5×11 m stair-ramp spanning local z 0→8.5. They are in the PhysicsBSP (ACE walks The Sentry on it at z 117..118; invisible-but-walkable in retail) but orphaned from the draw tree at every degrade level (LOD theory dead: Degrades[0] IS the base model).
  • Hill cottage 0x01000827: orphans {0..7}.
  • Our ObjectMeshManager.PrepareGfxObjMeshData iterates the dictionary → draws the collision skeleton: the wall staircase up close, the "flying stairs" over the cottage roofline from afar (orphan ramp spans world (221..232, 104..109, z 116..124.5)).
  • The naive filter broke doors (e46d3d9 → un-applied 124c6cb): filtering the dictionary to the PosNode/NegNode-walked id set made doors vanish across Holtburg. OPEN MYSTERY #1 (first diagnostic of the new session): run the histogram fact on a door GfxObj — either DatReaderWriter's DrawingBSPNode exposes polys some other way for those trees (portal-type nodes? Portals list? leaf indexing?), or the parse is incomplete, or doors' visible polys genuinely live outside node.Polygons. The correct retail shape is BSP-traversal-order drawing, not dictionary-with-filter — the filter was a minimal approximation.
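The orphan diff described in this section can be sketched as follows. This is a Python illustration, not the project's C# and not DatReaderWriter's API: the `BspNode` layout (a polygon id list plus positive and negative children) is a hypothetical simplification, and the walk is a fixed pre-order, not retail's view-dependent traversal order. As the door mystery suggests, a real tree may expose polygons through fields this sketch does not model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BspNode:
    """Hypothetical drawing-BSP node: polygon ids plus two children."""
    polygon_ids: List[int] = field(default_factory=list)
    pos: Optional["BspNode"] = None
    neg: Optional["BspNode"] = None


def referenced_polygon_ids(root: Optional[BspNode]) -> List[int]:
    """Polygon ids reachable from the tree, in pre-order (pos before neg),
    each id listed once."""
    ordered: List[int] = []
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        for pid in node.polygon_ids:
            if pid not in seen:
                seen.add(pid)
                ordered.append(pid)
        # Push neg first so pos is popped (visited) first.
        stack.append(node.neg)
        stack.append(node.pos)
    return ordered


def orphan_polygon_ids(polygons: Dict[int, object],
                       root: Optional[BspNode]) -> List[int]:
    """Ids present in the polygon dictionary but referenced by no node."""
    referenced = set(referenced_polygon_ids(root))
    return sorted(pid for pid in polygons if pid not in referenced)


# Toy model: six dictionary polygons, the tree references only 2..5.
toy_polygons = {pid: f"poly{pid}" for pid in range(6)}
toy_tree = BspNode([2], pos=BspNode([3, 4]), neg=BspNode([5]))
```

On the toy data the tree yields `[2, 3, 4, 5]` and the orphan set is `[0, 1]`. Drawing from the traversal result instead of iterating the dictionary is the shape the text calls correct; filtering the dictionary by the walked id set is the approximation that regressed doors.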

2.4 Earlier session facts that still stand

  • A9B3 has ONE building (the hill cottage); the #112 gap is a real 20 cm doorway micro-gap; all 17 cottage cells share one identical Position — "misplaced interior cell" is refuted. The phantom-stairs building is the AAB3 meeting hall.
  • Door alignment between independently-placed interior cells and the rotated shell is exact (rotation conventions are right); the inn's server-placed door aligning with our shell is the world-anchored confirmation.
  • ACE patrol z-values are DB-authored, not physics-grounded — weak as a geometry oracle.
  • [resolve] lines in our logs show our client grounding the elevated Sentry at terrain z=116 vs the server's 117.2 — remote-entity grounding vs invisible-walkable geometry (#41-family data point).

3. The architecture gap map (what the investigation must complete)

What we believe we know, per area — every row needs decomp-verified confirmation and a "theirs vs ours" write-up. 🟢 = we have a verified port, 🟡 = partial/approximation, 🔴 = missing/unmapped.

| Area | Retail mechanism (anchors) | acdream today | Status |
|---|---|---|---|
| Object draw (GfxObj) | DrawingBSP traversal (D3DPolyRender::ConstructMesh 0x0059dfa0 noted in code; CPhysicsPart::Draw); per-poly planeMask clip; degrade table per part (GfxObjDegradeResolver doc) | Flattened dictionary mesh into global VAO + MDI; degrades only for humanoid setups (#47); no-draw orphans rendered | 🔴 core divergence |
| Building shells | CBuildingObj with leaf_cells (CPartCell) + portals; drawn via LScape → DrawSortCell → DrawBuilding; shell parts drawn per portal-view slot with set_view clip + viewconeCheck (pc:429282-429295) | One WorldEntity per BuildInfo, whole-model mesh, frustum cull only | 🔴 |
| Interior cells (render) | PView::DrawInside flood (ConstructView/ClipPortals/InitCell, pc:432896-433895) + per-cell portal_view slices + per-poly clip to the view | Flood ported (R-A1/A2/A2b + dac8f6a, conformance-gated); draw-side clip outdoor-only; indoor regions not draw-quality (#114) | 🟡 |
| Statics in cells | Drawn with their cell, per portal-view slot, viewcone-checked (CEnvCell::draw → object lists) | Per-cell buckets via flood (DrawCellObjectLists), unclipped, separate particle gating (particles not flood-gated the same way → flames through walls) | 🟡 |
| Dynamic objects (doors, NPCs, items) | CPhysicsObj part arrays; portal-view visibility checks; never hard poly-clipped | WbDrawDispatcher MDI, frustum + visibleCellIds routing | 🟡 (audit needed) |
| Culling | viewer-cell rooted; portal-clipped BFS is THE cull indoors; outdoor = LScape draw + building portal floods; viewconeCheck per mesh | Option A DrawInside is in; outdoor per-building floods (48 m seed); cell-particle scissor partial | 🟡 |
| Interior collision | Per-cell shadow_object_list, portal-aware registration (add_shadows_to_cells) | Landblock-wide ShadowObjectRegistry + b3ce505 gate (workaround) → #99 | 🔴 (= A6.P4, already designed) |
| Degrade/LOD selection | Per-part current GfxObj chosen from degrade table by distance/quality | Base model everywhere except humanoid setups | 🔴 |
| Cell-struct no-draw polys | Same drawing-BSP rule presumably applies to CellStruct (cells have their own DrawingBSP) | Dictionary iteration (site 2, ObjectMeshManager ~:1343) | 🔴 unverified |

The unifying theme: retail has ONE drawing discipline — BSP/portal-driven traversal decides what is drawn, and the portal view clips what survives — applied uniformly to terrain peeks, shells, cells, statics, and meshes. We replaced traversal with flattened-mesh iteration + a separately-bolted visibility filter, and every bug in the #105→#113 family is a place where the two disagree.

4. Open mysteries (carry into the investigation)

  1. Door vanish under the BSP filter (§2.3) — first diagnostic, 15 min with the existing dump harness. Identify a door GfxObj id via ACE weenie data or by clicking a door and reading [B.7] pick-info ... setup= from the log.
  2. Indoor clip-region draw-quality (#114) — knife-edge/multi-view/8-plane-fallback.
  3. Entry transparency at the hilltop cottage (user: still intermittent) — may be render (flood at entry) or membership at the threshold; needs a probe capture.
  4. Particles visible through walls — particle pass is not gated/clipped like meshes.
  5. Camera drag/jitter in cramped interiors (#115) — retail boom smoothing (SmartBox::update_viewer region) unread.

5. Tooling inventory (everything the investigation needs is live)

  • Ghidra MCP, port 8081, CORRECT program (patchmem 2013 v11.4186 + full PDB — verified: find_transit_cells @ 0052c820 matches). Endpoints: /decompile_function?address=0x..., /searchFunctions?query=..., /function_xrefs?name=..., /list_functions. This is the best decomp source — Ghidra renders x87 correctly where BN pseudo-C invents branches (proven twice).
  • Named pseudo-C: docs/research/named-retail/acclient_2013_pseudo_c.txt + acclient.h (verbatim structs) + symbols.json.
  • Live cdb attach (read-only disassembly protocol proven this session; static cdb -z uf mis-decodes at OMAP boundaries — use live attach): see CLAUDE.md "Retail debugger toolchain"; the user will launch the 2013 client on request.
  • Dat dump harness: tests/AcDream.Core.Tests/Conformance/Issue113PhantomStairsDumpTests.cs — buildings, cells, statics, portal planes, poly-flag histograms, DrawingBSP orphan diff, degrade chains, top-down ASCII maps. Extend freely.
  • Flood replay harnesses: CornerFloodReplayTests (indoor), Issue113MeetingHallFloodTests (outdoor per-building flood + assembler).
  • Key decomp anchors already mined: PView::InitCell :432896 / ClipPortals :433572 / AddViewToPortals :433446 / ConstructView :433750+:433827 / DrawInside :433793 / DrawPortal :433895; Render::set_view :343750; DrawEnvCell poly submit :427922; DrawBuilding :429282; CEnvCell::find_transit_cells 0052c820; CBuildingObj/CBldPortal/CCellPortal/portal_view_type structs in acclient.h :31908/:32094/:32300/:32346.
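For scripted lookups, the endpoint shapes listed above can be assembled into request URLs. A minimal Python sketch follows; the endpoint paths and parameter names are the ones quoted in this section, while the host (`127.0.0.1`), the helper name, and the assumption that the bridge accepts plain HTTP GET with URL-encoded query strings are not confirmed by the text.

```python
from urllib.parse import urlencode

# Assumption: the Ghidra MCP bridge listens on localhost; the port is from the text.
BASE_URL = "http://127.0.0.1:8081"


def endpoint_url(path: str, **params: str) -> str:
    """Build a URL for one of the endpoints named above.
    Query values are percent-encoded (so '::' becomes '%3A%3A')."""
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# The straddle-gate anchor from this handoff, addressed three ways.
decompile_url = endpoint_url("/decompile_function", address="0x0052c820")
search_url = endpoint_url("/searchFunctions", query="find_transit_cells")
xrefs_url = endpoint_url("/function_xrefs", name="CEnvCell::find_transit_cells")
list_url = endpoint_url("/list_functions")
```

The resulting strings can be passed to any HTTP client (for example `urllib.request.urlopen`); the response format is not described in this handoff, so parsing is left out.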

6. Investigation charter (for the new session)

Phase A — Map (workflows, parallel subagents). One mapping agent per area in §3's table, each producing "RETAIL: call-chain + data structures + exact gates (verbatim decomp lines)" vs "ACDREAM: call-chain + data structures (file:line)" vs "DIVERGENCES: ranked by user-visible blast radius". Adversarially verify each claimed divergence (the BN-invents-branches lesson; the A6.P5 caller/callee conflation lesson). Areas:

  1. GfxObj draw path (BSP traversal, no-draw, degrades, planeMask clip)
  2. Building shells (CBuildingObj, leaf_cells, per-portal-view drawing)
  3. Interior cells (PView slices → draw-side clip; what makes regions pixel-exact)
  4. Statics + dynamics in cells (object lists, viewconeCheck, particles)
  5. Culling end-to-end (LScape outdoor walk → building floods → indoor BFS)
  6. Interior collision (per-cell shadow lists — fold in the existing A6.P4 design)

Phase B — Plan. One phased port plan with: an invariant ("one drawing discipline"), per-phase acceptance criteria (conformance tests + which user-visible bugs each phase closes: phantom geometry class, #114, #108, #109, doors, #99, particles-through-walls), explicit keep-list (the flood port, the straddle gate, membership, streaming, bindless MDI — the code worth saving), and a migration order that keeps the client playable between phases. The plan goes to the user for approval BEFORE any production code.

Ground rules: investigation-first (no production edits); every retail claim needs a decomp citation or live-binary proof; every acdream claim needs file:line; tests can codify bugs — verify what call sites actually pass.

7. Paste-ready prompt for the new session

Pick up acdream as a SENIOR 3D ENGINE DEVELOPER for the HOLISTIC BUILDING-RENDER
INVESTIGATION (mandated by the user 2026-06-11: "solve this holistic once and for
all... map acdream's way vs retail, then make a plan how to port retail's way" —
no more cottage-by-cottage fixes). Worktree branch claude/thirsty-goldberg-51bb9b,
HEAD 124c6cb. NOTHING goes to main; no production code this session — the
deliverables are (1) an acdream-vs-retail architecture comparison and (2) a phased
port plan for user approval.

READ FIRST (in order):
1. docs/research/2026-06-11-building-render-holistic-port-handoff.md  ← THE charter:
   branch state, the full evidence inventory (orphan no-draw polys, the door-vanish
   mystery, draw-side clip status, straddle gate), the gap map (§3), the open
   mysteries (§4), tooling (§5), and the investigation phases (§6).
2. Memory digests: project_render_pipeline_digest + project_physics_collision_digest
   (DO-NOT-RETRY tables apply).
3. docs/architecture/worldbuilder-inventory.md + docs/ISSUES.md (#113/#114/#115/#99).

DO:
- Phase A: ultracode Workflow fan-out — one mapping agent per area (GfxObj draw,
  building shells, interior cells, statics/dynamics, culling, interior collision),
  each delivering RETAIL (verbatim decomp, Ghidra MCP port 8081 is live with the
  correct PDB; live cdb attach available on request — static cdb -z misdecodes)
  vs ACDREAM (file:line) vs RANKED DIVERGENCES; adversarially verify divergences
  (BN pseudo-C invents branches — proven twice; prefer Ghidra/live-binary).
- First 15-min diagnostic: why the DrawingBSP filter (e46d3d9, un-applied in
  124c6cb) made DOORS vanish — run Issue113PhantomStairsDumpTests' histogram on a
  door GfxObj (get the id from ACE weenie data or a [B.7] pick line).
- Phase B: write the phased port plan (one drawing discipline: BSP/portal-driven
  traversal + portal-view clip), per-phase acceptance criteria naming which bugs
  close (phantom-geometry class, #114 indoor crop, #108, #109, doors, #99,
  particles-through-walls), an explicit keep-list (flood port, straddle gate,
  membership, streaming, bindless MDI), and a migration order that keeps the
  client playable. STOP for user approval before any implementation.

Baseline: Core 1392 + 4 pre-existing #99-era failures + 1 skip / App 226 / UI 420 /
Net 294. The branch state is honest: doors work, the phantom staircase is visible
again (documented), outdoor shell clip on, indoor clip off (#114).

8. Session artifacts (untracked, worktree root)

issue113-user-screenshot-{1,2}.png (the original gate pair, extracted from the transcript), issue113-fix-screenshot{1,2}.png (post-clip pre-gate), issue113-bisect-*.png/log (pre-session build bisect), issue112-ftc-live-disasm.log (the live-binary disassembly of find_transit_cells — keep, it is the straddle-gate proof), issue113-user-gate{,2,3}.log (the three gate launches).