acdream/docs/research/2026-05-21-collision-fixes-shipped-handoff.md
Erik 56d2b5e4a1 docs(physics): handoff for 2026-05-21 collision-fix session
Captures everything that shipped in the session — A1, A1.5, A1.6,
A1.7 plus the walk-miss probe spike — and what's still open:

- A4 (multi-cell BSP iteration) — the next big architectural fix,
  closes the "walls walk-through-able in vestibule cells" gap
- A2 (PHSP inversion) — small fix, but only meaningful paired with A3
- A3 (synthesis removal) — needs A4 in place first to avoid
  reverting back to Bug A's free-fall regression
- Lighting bugs (indoor lighting + spotlight projection) — M7 polish,
  separate session

Includes per-fix commit SHAs, code anchors, retail decomp anchors,
probe + launch reference, anti-patterns, and a fresh-session pickup
prompt for boxing into Claude Code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 14:42:47 +02:00


Collision fixes — session 2026-05-21 shipped handoff

Branch: claude/lucid-goldberg-1ba520

Worktree: C:\Users\erikn\source\repos\acdream\.claude\worktrees\lucid-goldberg-1ba520

Commits ahead of main: 8 (probe spike + 4 fixes + docs)

TL;DR

User reported the world feeling buggy — collision in thin air inside and outside buildings, walls walk-through-able in spots. A two-step investigation surfaced a foundation-level math bug (PolygonHitsSpherePrecise inverted vs retail) and four discrete registration / cell-tracking bugs. Four surgical fixes landed this session (A1, A1.5, A1.6, A1.7) plus a [walk-miss] / [floor-polys] diagnostic probe set that quantified the bug rates. What's left is one architectural change (A4: multi-cell BSP iteration) and three smaller code-correctness items. Visual verification at the end of each phase confirmed forward progress; remaining wall-walkthroughs in vestibule cells are the A4 gap.

What shipped this session

Probe spike (3 code commits + 1 docs commit)

| SHA | What | Why |
|---|---|---|
| 27c7284 | ProbeWalkMissEnabled flag + roundtrip test | Diagnostic gate for ISSUES #83 H-disambiguation |
| 31da57c | WalkMissDiagnostic aggregator + 2 logic tests | Pure-function aggregator over CellPhysics.Resolved |
| a2e7a87 | [walk-miss] + [floor-polys] emission sites | Wire flag + aggregator into Transition.FindEnvCollisions MISS branch + PhysicsDataCache.CacheCellStruct |
| bb1e919 | Spec + plan + findings docs | The doc artifacts for the spike |

The walk-miss probe produced the smoking-gun analysis in docs/research/2026-05-21-walk-miss-capture-findings.md: 0.38 % synthesis HIT rate, with a 2 cm boundary between HIT (dz≈0.46 m) and MISS (dz≈0.48 m) at sphere radius 0.480 m. This proved PolygonHitsSpherePrecise is inverted vs retail's polygon_hits_sphere_slow_but_sure (BSPQuery.cs:117 vs acclient_2013_pseudo_c.txt:322509-322517). That's Phase A2, still pending.

Collision fixes (4 commits)

| Phase | SHA | Fix |
|---|---|---|
| A1 | 5f2b545 | Skip mesh-AABB-fallback cylinder for landblock stabs. Stabs (entity.Id 0xC0XXYY00+n) had their per-part BSP shadow correctly registered AND a redundant 1.5 m-clamped invisible cylinder at the mesh origin. The cylinder was the "thin air" collision inside cottages. Gate: _isLandblockStab = (entity.Id & 0xFF000000u) == 0xC0000000u. |
| A1.5 | 4d3bf6f | Scope interior cell shadows to ParentCellId. ShadowObjectRegistry.Register assigned every entity to outdoor landcells based on XY. Interior statics (fireplace, furniture in cell 0xA9B40121) got stamped into the outdoor landcell whose XY they overlapped (e.g., 0xA9B40029), firing collisions for players walking OUTSIDE the building. New optional cellScope parameter, passed entity.ParentCellId ?? 0u from all 5 entity-loop call sites. |
| A1.6 | 700abad | Skip Setup CylSphere/Sphere shadows for landblock stabs. A1 only gated the mesh-AABB-fallback path. Setup-derived registrations (lines 5910-6005 in GameWindow) still fired for stabs whose source is a Setup with CylSpheres. Same _isLandblockStab gate, extended to the outer if (setup is not null) block. |
| A1.7 | 4679134 | Fall through to outdoor cell when indoor BSP doesn't contain player. CellTransit.FindCellList returns currentCellId when no candidate cell's CellBSP contains the sphere, but this also fired when the player walked OUTSIDE the entire portal-connected indoor graph. The player's CellId was stuck on an old indoor cell whose BSP was geometrically far away; every indoor-bsp query returned OK at the BSP root; no walls blocked. Fix: after FindCellList, verify with PointInsideCellBsp; if not inside, fall through to the existing outdoor resolution branch. |
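The A1 / A1.6 gate is a plain mask-and-compare on the entity id. The snippet below is a small illustration of that test in Python rather than the C# in GameWindow; the function name and the sample ids are hypothetical, and only the mask and tag values come from the table above.

```python
LANDBLOCK_STAB_MASK = 0xFF000000  # top byte of the 32-bit entity id
LANDBLOCK_STAB_TAG = 0xC0000000   # stabs are described as 0xC0XXYY00+n


def is_landblock_stab(entity_id: int) -> bool:
    """Python mirror of (entity.Id & 0xFF000000u) == 0xC0000000u."""
    return (entity_id & LANDBLOCK_STAB_MASK) == LANDBLOCK_STAB_TAG


# Sample ids are made up for illustration only.
for sample in (0xC0A9B400, 0xC0A9B407, 0x80000123):
    print(hex(sample), is_landblock_stab(sample))
```

Only the top byte is inspected, so every per-stab index n under the same 0xC0 prefix passes the gate.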

Visual verification at each phase

Each fix was visually verified by walking the same buildings before/after:

  • A1: "thin air" inside cottage GONE.
  • A1.5: "thin air" outside buildings → 71/97 interior-static-leak hits down to 0.
  • A1.6: Setup-CylSphere bleed around buildings cleared.
  • A1.7: cell-id correctly transitions between indoor doorway cell and adjacent outdoor cell on building exit.

What's still broken

Per end-of-session user testing:

  1. Walls walk-through-able in "vestibule" cells. Some interior cells (e.g., the Holtburg cell 0xA9B40164) have very few physics polygons — only 4 polys, BSP bounding sphere of 2 m radius. When the player walks past the doorway, they're geometrically inside a neighboring cell's actual walls — but the collision check only queries the cell the player's center is "in." That cell (the vestibule) has no walls there. The neighboring cell's walls (e.g., 0xA9B40157 with 23 polys, 38 % hit rate when the player IS there) are never queried.
  2. Stairs walk-through. Likely the same multi-cell iteration gap — stairs span cell boundaries.
  3. Lighting indoors broken. Separate rendering concern; M7 polish.
  4. Items projecting spotlight on walls. Per-entity light direction bug; M7 polish.
  5. PHSP inversion (A2). Still pending. The [walk-miss] data proved this bug exists but fixing it alone doesn't fix walkable synthesis at the tangent boundary — needs to pair with synthesis removal (A3).
  6. Synthesis architecturally wrong (A3). Retail's grounded path never re-synthesizes ContactPlane; it retains via Mechanisms A/B/C. Our TryFindIndoorWalkablePlane runs every frame and is the wrong shape. Removing it is Bug A from the 2026-05-20 session — was tried + reverted because retention had its own gaps. A1.7 closed one of those gaps; A2 + A4 close the others.

The architectural picture (plain-English)

acdream's world is divided into invisible chunks called cells. There are two flavors:

  • Outdoor cells: the world is gridded into 24 m × 24 m squares. Each landblock (the 192 m × 192 m unit of streaming) has 64 such cells in an 8 × 8 grid. They get cell IDs like 0xA9B40029.
  • Indoor cells: each room (or section of room) inside a building gets its own cell. They're not grid-aligned; they follow the building's interior partitioning. Their low-16 value sits above the outdoor range (0x0001-0x0040 for the 64 landcells), starting at 0x0100, e.g. 0xA9B40157.
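As a rough illustration of how the two flavors can be told apart from the id alone, the sketch below splits a cell id into its landblock and low-16 parts. The helper names are hypothetical, and the thresholds (outdoor 0x0001-0x0040, indoor 0x0100 and above) are an assumption that happens to be consistent with the example ids in this document.

```python
OUTDOOR_CELLS_PER_LANDBLOCK = 8 * 8   # 8 x 8 grid of 24 m cells
INDOOR_CELL_BASE = 0x0100             # assumed first indoor cell index


def landblock_of(cell_id: int) -> int:
    """High 16 bits: the landblock the cell belongs to."""
    return (cell_id >> 16) & 0xFFFF


def cell_index(cell_id: int) -> int:
    """Low 16 bits: the cell's index within its landblock."""
    return cell_id & 0xFFFF


def is_outdoor_cell(cell_id: int) -> bool:
    return 1 <= cell_index(cell_id) <= OUTDOOR_CELLS_PER_LANDBLOCK


def is_indoor_cell(cell_id: int) -> bool:
    return cell_index(cell_id) >= INDOOR_CELL_BASE


for cid in (0xA9B40029, 0xA9B40157):
    kind = "outdoor" if is_outdoor_cell(cid) else "indoor"
    print(hex(cid), hex(landblock_of(cid)), kind)
```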

Each cell carries:

  • A CellBSP — defines the volume the cell occupies in space (used for "is this point inside this cell?" lookups during cell-id resolution).
  • A PhysicsBSP — the collision geometry (walls, floors, stairs) the player can hit.
  • Portals — connections to adjacent indoor cells (think doorways).
  • Static objects — furniture, decoration meshes hydrated as entities.

The collision system asks two things per frame:

  1. What cell is the player in? Driven by PhysicsEngine.ResolveCellId → CellTransit.FindCellList. Walks the portal graph from the current cell and picks the cell whose CellBSP contains the sphere center. With A1.7, when no indoor cell claims the player, resolution falls through to the outdoor landcell branch.
  2. Does the player hit anything? Driven by Transition.FindEnvCollisions. Queries the one cell the player is "in": its PhysicsBSP for walls/floor and its shadow-registered statics for furniture.
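Step 1, including the A1.7 fall-through, can be modeled in a few lines. This is a simplified sketch, not the engine code: axis-aligned boxes stand in for CellBSP volumes, and every function name, box extent, and the outdoor resolver are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner); stand-in for a CellBSP volume


def point_inside(box: Box, p: Point) -> bool:
    lo, hi = box
    return all(lo[i] <= p[i] <= hi[i] for i in range(3))


def find_cell_list(current: int, candidates: Iterable[int],
                   volumes: Dict[int, Box], center: Point) -> int:
    """Return the first candidate whose volume contains the point, else `current`."""
    for cell_id in candidates:
        if point_inside(volumes[cell_id], center):
            return cell_id
    return current  # nobody claimed the point: caller gets the old cell back


def resolve_cell_id(current: int, candidates: Iterable[int],
                    volumes: Dict[int, Box], center: Point,
                    resolve_outdoor: Callable[[Point], int]) -> int:
    cell_id = find_cell_list(current, candidates, volumes, center)
    # A1.7-style check: only trust the indoor answer if the point is really inside it.
    if point_inside(volumes[cell_id], center):
        return cell_id
    return resolve_outdoor(center)


volumes = {
    0xA9B40164: ((0.0, 0.0, 0.0), (2.0, 2.0, 3.0)),  # made-up extents
    0xA9B40157: ((2.0, 0.0, 0.0), (6.0, 4.0, 3.0)),
}
graph = [0xA9B40164, 0xA9B40157]
outdoor = lambda p: 0xA9B40029  # placeholder for outdoor landcell resolution

print(hex(resolve_cell_id(0xA9B40164, graph, volumes, (3.0, 1.0, 1.0), outdoor)))
print(hex(resolve_cell_id(0xA9B40164, graph, volumes, (-5.0, 1.0, 1.0), outdoor)))
```

Without the containment check, the second call would keep returning the stale indoor cell, which is the "cell-id stuck" symptom A1.7 addressed.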

The architectural gap is that step 2 queries only one cell. Retail queries the cell_array: the sphere center's cell plus every other cell the sphere geometrically overlaps. So if you're in a vestibule cell with no real walls but your shoulder pokes into the next room's wall, retail's collision sees the wall. acdream doesn't.

Phase A4 — multi-cell iteration (the next big fix)

This is the gap. Implementation sketch:

What to port from retail

CTransition::check_other_cells at acclient_2013_pseudo_c.txt:272717-272798. After the primary cell's find_collisions runs, it iterates every other cell in this->cell_array (built from CObjCell::find_cell_list which fills via interior portals + add_all_outside_cells for outdoor neighbors). For each cell:

  • Calls the cell's vtable find_collisions.
  • On Slid (4): clears contact_plane_valid, returns.
  • On Collided (2) or Adjusted (3): returns immediately.
  • On OK: continues to the next cell.

If the sphere is geometrically outside the original cell, the fallback (lines 272761-272797) sets check_cell = var_4c (the cell containing the final position) and adjusts check_pos.objcell_id.
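The iteration described above can be sketched as a short loop. This is a model of the control flow as summarized in this section, not a transcription of the decomp: the enum name, the OK = 1 value, the `state` dict, and the callback signature are assumptions, and the outside-the-original-cell fallback is left out.

```python
from enum import IntEnum
from typing import Callable, Dict, Iterable


class TransitionState(IntEnum):
    OK = 1         # assumed value; the text only gives 2, 3 and 4
    COLLIDED = 2
    ADJUSTED = 3
    SLID = 4


def check_other_cells(cell_array: Iterable[int], primary_cell: int,
                      find_collisions: Callable[[int], TransitionState],
                      state: Dict[str, bool]) -> TransitionState:
    """Query every non-primary cell; stop at the first non-OK result."""
    for cell_id in cell_array:
        if cell_id == primary_cell:
            continue  # the primary cell was already handled by the caller
        result = find_collisions(cell_id)
        if result == TransitionState.SLID:
            state["contact_plane_valid"] = False
            return result
        if result in (TransitionState.COLLIDED, TransitionState.ADJUSTED):
            return result
    return TransitionState.OK


# Toy run: the vestibule is the primary cell, the wall lives in the neighbor.
visited = []

def fake_find_collisions(cell_id: int) -> TransitionState:
    visited.append(cell_id)
    return TransitionState.COLLIDED if cell_id == 0xA9B40157 else TransitionState.OK

state = {"contact_plane_valid": True}
outcome = check_other_cells([0xA9B40164, 0xA9B40157, 0xA9B40121],
                            0xA9B40164, fake_find_collisions, state)
print(outcome.name, [hex(c) for c in visited])
```

Because the loop returns on the first non-OK result, cells later in the array are never queried once a wall is found.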

What we already have

Phase 2 portal cell-tracking is shipped (commits 1969c55..eb0f772, 2026-05-19). It gives us:

  • CellTransit.FindCellList (sphere variant) — top-level driver.
  • CellTransit.FindTransitCellsSphere — interior portal neighbour expansion.
  • CellTransit.AddAllOutsideCells — outdoor landcell neighbour expansion.
  • CellPhysics.VisibleCellIds — pre-computed visible-cell set per cell.

These currently feed cell-id resolution (step 1 above). They are NOT yet used to drive collision iteration (step 2). A4's job is to wire them into Transition.FindEnvCollisions.

Implementation outline for A4

  1. In Transition.FindEnvCollisions (src/AcDream.Core/Physics/TransitionTypes.cs:1407-1559):
    • Currently: queries one cell (engine.DataCache.GetCellStruct(sp.CheckCellId)) and runs BSPQuery.FindCollisions against its BSP.
    • Change to: build the cell_array from the current cell using CellTransit.FindCellList (or a new variant that returns the full set), then iterate each cell and run BSP collision against each. Combine results.
  2. Combine semantics match retail's check_other_cells:
    • Any cell returning Collided (2) or Adjusted (3) → return that immediately (halt iteration).
    • Any cell returning Slid (4) → clear contact_plane_valid and return Slid immediately, as retail does.
    • All cells OK → return OK.
  3. Outdoor case: if the resolved cell is outdoor, iterate adjacent outdoor landcells via AddAllOutsideCells and any indoor cells accessible via building portals (CheckBuildingTransit). Both already exist as helpers.
  4. Shadow objects (the L.2d [resolve-bldg] path) likely need the same multi-cell awareness: FindObjCollisions only checks shadows keyed to the player's current cell. Because A1.5 scopes interior shadows to their ParentCellId, running FindObjCollisions over the same cell set should pick them up without further registration changes.
  5. Testing strategy:
    • Unit tests: synthetic two-cell fixture where wall lives in cell B and player is in cell A's vestibule. Assert collision fires.
    • Live capture: walk the Holtburg inn vestibule (0xA9B40164) and verify walls in 0xA9B40157 now block.
  6. Performance: each cell query is ~50 µs, and multi-cell iteration visits ~3-7 cells in the worst case, so a resolve costs ~150-350 µs in total (~100-300 µs more than today's single-cell query). At 30 Hz that is at most ~10 ms of CPU per second. Acceptable.
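The unit-test idea in step 5 can be prototyped with a tiny synthetic fixture before any engine types are involved. The sketch below is a 2D toy under stated assumptions: walls are vertical segments, the vestibule is modeled with no walls at all, and the class and function names are hypothetical. Only the 0.48 m sphere radius and the two cell ids come from this document.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

SPHERE_RADIUS = 0.48  # player sphere radius in metres


@dataclass
class Wall:
    x: float       # wall plane at x = const
    y_min: float
    y_max: float


@dataclass
class Cell:
    walls: List[Wall] = field(default_factory=list)


def sphere_hits_wall(cx: float, cy: float, radius: float, wall: Wall) -> bool:
    return abs(cx - wall.x) < radius and wall.y_min <= cy <= wall.y_max


def collides(cells: Dict[int, Cell], cell_ids: Iterable[int],
             cx: float, cy: float, radius: float = SPHERE_RADIUS) -> bool:
    """True if the sphere overlaps any wall owned by any of the queried cells."""
    return any(sphere_hits_wall(cx, cy, radius, w)
               for cid in cell_ids for w in cells[cid].walls)


VESTIBULE, ROOM = 0xA9B40164, 0xA9B40157
cells = {VESTIBULE: Cell(), ROOM: Cell([Wall(x=1.0, y_min=-2.0, y_max=2.0)])}

# Player center is "in" the vestibule, 0.3 m from the neighbor's wall.
print(collides(cells, [VESTIBULE], 0.7, 0.0))        # single-cell query
print(collides(cells, [VESTIBULE, ROOM], 0.7, 0.0))  # multi-cell query
```

The single-cell query misses the wall and the multi-cell query finds it, which is the assertion the real two-cell fixture would make against Transition.FindEnvCollisions.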

Risks

  • R1: shadow objects in cells visible from multiple positions may get tested multiple times in one frame. Need dedup via the existing _entityToCells map.
  • R2: cells in cell_array may have stale CellPhysics (loaded for rendering but not for physics). Guard with cellPhysics?.BSP?.Root is not null.
  • R3: the existing BSPQuery.FindCollisions mutates Transition state (SpherePath.CheckPos, CollisionInfo). Running it multiple times per frame requires either save/restore between cells or letting the first-hit's mutations stand (matching retail).
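R1 and R2 both reduce to small filters over the cell set. The sketch below uses a per-resolve seen-set for R1 rather than the _entityToCells map (whose shape is not shown in this document), and a None check standing in for cellPhysics?.BSP?.Root is not null for R2; all names and ids are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def usable_cells(cell_ids: Iterable[int],
                 bsp_roots: Dict[int, Optional[object]]) -> List[int]:
    """R2: keep only cells whose physics BSP root is actually loaded."""
    return [cid for cid in cell_ids if bsp_roots.get(cid) is not None]


def unique_shadows(cell_ids: Iterable[int],
                   shadows_by_cell: Dict[int, List[int]]) -> List[int]:
    """R1: each shadow entity is tested once per resolve, in first-seen order."""
    seen = set()
    ordered: List[int] = []
    for cid in cell_ids:
        for entity_id in shadows_by_cell.get(cid, ()):
            if entity_id not in seen:
                seen.add(entity_id)
                ordered.append(entity_id)
    return ordered


bsp_roots = {0xA9B40164: object(), 0xA9B40157: object(), 0xA9B40121: None}
shadows = {0xA9B40164: [101, 102], 0xA9B40157: [102, 103]}  # 102 spans both cells

cells_to_query = usable_cells([0xA9B40164, 0xA9B40157, 0xA9B40121], bsp_roots)
print([hex(c) for c in cells_to_query])
print(unique_shadows(cells_to_query, shadows))
```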

Other pending items

Phase A2 — PHSP inversion fix

BSPQuery.PolygonHitsSpherePrecise at BSPQuery.cs:117 has its early-return condition inverted vs retail's polygon_hits_sphere_slow_but_sure at acclient_2013_pseudo_c.txt:322509-322517. Ours bails when sphere is FAR from plane; retail bails when sphere is OVERLAPPING plane.

The actual fix is one line, but it doesn't fix walkable synthesis on its own (because AdjustSphereToPlane still rejects tangent). It DOES affect wall-collision precision at the tangent boundary. Pair with A3 (synthesis removal) for the full benefit.

Phase A3 — synthesis removal

Delete TryFindIndoorWalkablePlane (TransitionTypes.cs:1294) and rely on the three retail CP retention mechanisms (Mechanisms A/B/C). The previous session (2026-05-20) tried this and reverted because multi-cell iteration was missing, so doorway transitions caused free-fall. With A1.7 + A4 in place, A3 should work.

Lighting bugs

  • Indoor lighting broken: probably cell-light association or visibility culling for lights inside cells.
  • Spotlight projection: per-entity light direction transform.

These are M7 polish, separate phase. Not blocking M2 ("kill a drudge").

How to start a fresh session

Copy the block below into a new Claude Code session in the acdream worktree:


Pick up the acdream collision-fix work from the 2026-05-21 session.

1. Read docs/research/2026-05-21-collision-fixes-shipped-handoff.md
   FIRST. It captures everything that shipped (4 fixes A1/A1.5/A1.6/A1.7
   + a probe spike) and what's left (Phase A4 multi-cell iteration is
   the next major user-visible win).

2. Current branch state: claude/lucid-goldberg-1ba520 is 8 commits
   ahead of main. All 4 fixes have visual verification + no regression
   in the 1129-test baseline. Ready to merge or build A4 on top.

3. The next phase to design + ship is **A4 (multi-cell BSP iteration)**.
   Sketch in §"Phase A4" of the handoff. Reads retail's
   CTransition::check_other_cells (acclient_2013_pseudo_c.txt:272717-272798).
   Wires the existing CellTransit helpers (FindCellList,
   FindTransitCellsSphere, AddAllOutsideCells) into
   Transition.FindEnvCollisions so collision is queried against ALL
   cells the sphere overlaps, not just the one cell the player's
   center is in.

4. CLAUDE.md rules apply:
   - No workarounds. Retail-faithful.
   - Probe-first, design-second. Already have [indoor-bsp] +
     [cell-transit] + [cell-cache] probes available.
   - Use the superpowers:brainstorming skill before writing code.
     A4 is a real architectural change deserving its own spec.
   - Visual verification at the Holtburg inn (cell 0xA9B40164
     vestibule) is the acceptance test — walls in cell 0xA9B40157
     should block when the player is "in" 0xA9B40164 but their sphere
     extends into 0xA9B40157.

5. M2 ("kill a drudge") is the active milestone. Indoor walking
   robustness is on the M2 critical path because dungeons have
   drudges. A4 is the last big collision fix needed for M2's
   "walkable indoor space" demo target.

6. Launch command (same as this session):
   $env:ACDREAM_DAT_DIR              = "$env:USERPROFILE\Documents\Asheron's Call"
   $env:ACDREAM_LIVE                 = "1"
   $env:ACDREAM_TEST_HOST            = "127.0.0.1"
   $env:ACDREAM_TEST_PORT            = "9000"
   $env:ACDREAM_TEST_USER            = "testaccount"
   $env:ACDREAM_TEST_PASS            = "testpassword"
   $env:ACDREAM_DEVTOOLS             = "1"
   $env:ACDREAM_PROBE_INDOOR_BSP     = "1"
   $env:ACDREAM_PROBE_CELL           = "1"
   $env:ACDREAM_PROBE_CELL_CACHE     = "1"
   dotnet build -c Debug
   dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 |
       Tee-Object -FilePath "launch-a4.log"

   DO NOT set ACDREAM_PROBE_RESOLVE — it lagged the client this
   session (400k+ log lines at 30 Hz).

State the milestone + chosen phase in your first action.

Anti-patterns from this session

  1. Don't enable ACDREAM_PROBE_RESOLVE for live captures. It emits one line per resolve call at 30 Hz, producing 400k+ lines per session and making the client laggy enough that the user couldn't move. Use the lighter [indoor-bsp] + [cell-transit] probes instead.

  2. Don't assume "walk through wall" means PHSP inversion. This session walked through that misconception twice. The actual cause was different bugs each time (doubled cylinders, interior shadow bleed, cell-id stuck, missing physics polys in vestibule cells). Always capture probe data before designing fixes.

  3. Don't merge A1.5's pattern (cellScope: entity.ParentCellId) without understanding that interior shadows might need MULTI-cell scope, not just their parent cell. A1.5 fixed the obvious leak but introduced "stairs span cells" gaps. The real fix needs A4.

  4. Don't skip visual verification between fixes. Each of A1, A1.5, A1.6, A1.7 was visually confirmed before moving to the next. The user reported what was still broken at each step, which guided the next fix. Without that loop, we'd have shipped a "fix" that broke something else.

  5. Don't try to fix lighting bugs in the same session as collision bugs. Different domain (rendering, not physics). Defer to its own session.

Code anchors

This session's fixes (in commit order)

What A4 will touch

Retail decomp anchors

  • acclient_2013_pseudo_c.txt:272717-272798 - CTransition::check_other_cells (A4 oracle).
  • :272565-272582 - validate_transition Mechanism B (LKCP proximity).
  • :273242-273340 - transitional_insert Mechanism C (step-down probe).
  • :322032-322077 - CPolygon::adjust_sphere_to_plane.
  • :322403-322500 - CPolygon::polygon_hits_sphere.
  • :322504-322593 - CPolygon::polygon_hits_sphere_slow_but_sure (A2 oracle, inversion).
  • :322974-322993 - CPolygon::pos_hits_sphere (front-face culling).
  • :323725-323939 - BSPTREE::find_collisions (full 6-path dispatcher).
  • :326211-326242 - BSPNODE::find_walkable.
  • :326706-326727 - BSPLEAF::sphere_intersects_poly.
  • :326793-326816 - BSPLEAF::find_walkable.

Probe + diagnostic reference

| Env var | Volume | When to use |
|---|---|---|
| ACDREAM_PROBE_INDOOR_BSP | Low (indoor cells only) | Wall walk-through investigations. Logs cell, wpos, lpos, result, hit poly. |
| ACDREAM_PROBE_CELL | Very low (cell change events) | Cell-tracking issues. Logs old → new cell + position. |
| ACDREAM_PROBE_CELL_CACHE | One-shot per cell load | When you need cell BSP poly counts + bsphere. Identifies "vestibule" cells with sparse geometry. |
| ACDREAM_PROBE_WALK_MISS | High (per-frame MISS) | Walkable synthesis investigations (Phase A2/A3 work). |
| ACDREAM_PROBE_BUILDING | Medium | Building-shadow attribution. Multi-line [resolve-bldg] per hit. |
| ACDREAM_PROBE_RESOLVE | VERY HIGH, DO NOT USE FOR LIVE PLAY | Per-resolve attribution. 30 Hz × per-entity = 400k+ lines/session. Lagged the client this session. |
| ACDREAM_PROBE_CONTACT_PLANE | Medium | CP retention investigations. Bug B from 2026-05-20 era. |

Log analysis recipe

# 1. Convert UTF-16LE to UTF-8 for grep:
Get-Content launch.log -Encoding Unicode | Out-File launch.utf8.log -Encoding utf8

# 2. Quick counts:
grep -c '\[indoor-bsp\]' launch.utf8.log
grep -c '\[cell-transit\]' launch.utf8.log

# 3. Per-cell hit rate:
grep '\[indoor-bsp\] cell=0xA9B40164' launch.utf8.log | grep -oE 'result=[A-Za-z]+' | sort | uniq -c
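Where grep is not on the PATH, the per-cell tally in step 3 can be reproduced with a short Python script that reads the UTF-16 log directly. This is a sketch that assumes the line shape implied by the grep patterns above ("[indoor-bsp] cell=0x… … result=Name"); the sample lines and result names are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

RESULT_RE = re.compile(r"result=([A-Za-z]+)")


def per_cell_results(lines: Iterable[str], cell_id: str) -> Counter:
    """Count result=... values on [indoor-bsp] lines for one cell."""
    needle = f"[indoor-bsp] cell={cell_id}"
    counts: Counter = Counter()
    for line in lines:
        if needle in line:
            counts.update(RESULT_RE.findall(line))
    return counts


def read_log(path: str) -> Iterable[str]:
    """Yield lines from the raw launch log (UTF-16 with BOM, as written by Tee-Object)."""
    with open(path, encoding="utf-16") as handle:
        yield from handle


# Made-up sample lines; a real run would pass read_log("launch.log") instead.
sample = [
    "[indoor-bsp] cell=0xA9B40164 wpos=(1,2,3) result=OK",
    "[indoor-bsp] cell=0xA9B40164 wpos=(1,2,3) result=OK",
    "[indoor-bsp] cell=0xA9B40157 wpos=(4,5,6) result=Collided",
    "[cell-transit] 0xA9B40164 -> 0xA9B40157",
]
for name, count in per_cell_results(sample, "0xA9B40164").most_common():
    print(count, name)
```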

What this is NOT

This is NOT a complete fix for indoor walking. Walls walk-through-able remain in cells where the PhysicsBSP has sparse coverage (vestibule cells). A4 closes that gap by querying multiple cells per frame — which is exactly what retail does.

This is NOT related to the PHSP inversion (A2). A2 fixes per-poly overlap math precision at the tangent boundary. A4 fixes which cells get queried. They're orthogonal.

This is NOT related to the lighting bugs the user reported. Those are rendering-side; ignore in any collision work.

References