acdream/docs/research/2026-05-22-a6-p3-handoff.md
Erik c479ea68a3 docs(handoff): A6.P3 2026-05-22 EOS handoff + pickup prompt for #98 fix
Comprehensive handoff doc covering today's full A6.P3 work:
  - 13 commits shipped today (slice 2 + slice 3 + slice 4 probes +
    diagnosis)
  - Issue #98 sharply diagnosed via paired retail+acdream cdb captures:
    BSP path-selection bug (Path 5 vs Path 6) at BSPQuery.FindCollisions
    dispatcher
  - All 4 A6.P2 findings status updated (Findings 1, 3 closed; Finding 2
    partially closed + accepted divergence; Finding 4 = issue #95
    separate scope)
  - Failed fix attempts log so next session doesn't re-attempt dead ends
  - Concrete starting steps + file references for the next session
  - Pasteable pickup prompt at the bottom

CLAUDE.md "Currently working toward" block updated to reflect slice 3
ship + #98 sharp diagnosis + handoff doc pointer.

Test suite: 1148 + 8 pre-existing fail (baseline maintained).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 13:32:02 +02:00


A6.P3 handoff — 2026-05-22

Status: A6.P3 slices 1+2+3 SHIPPED. Issue #98 (cellar ascent stuck at top) diagnosed but NOT fixed. Sharp Path-5-vs-Path-6 BSP path-selection target identified with paired retail+acdream cdb evidence. Next session: fix #98 at BSPQuery.FindCollisions path-selection.

Pasteable session-start prompt at the bottom of this doc.


TL;DR

Two full days of A6 work landed:

| Day | Slice | Result |
|---|---|---|
| 2026-05-21 | A6.P1 + A6.P2 + A6.P3 slice 1 (CP retention strip + Mechanism B) | Stairs + cellar descent work in acdream. A6.P2 Finding 1 (dispatcher freq) closed as side-effect of Finding 2 (CP-write blowup). |
| 2026-05-22 morning | A6.P3 slice 2 (L622 seed; v1 reverted; v2 no-op guard) | #96 partially addressed; accepted as documented retail divergence. |
| 2026-05-22 morning | A6.P3 slice 3 (cell-resolver stickiness; v1/v2/v3) | Cell-resolver ping-pong CLOSED. #90 workaround now redundant (defer A6.P4 removal). |
| 2026-05-22 noon | Slice 4 polydump probe + retail cdb capture | Pinpointed #98 root cause: our BSP picks Path 5 (Contact→step_up→adjust_sphere push-back) for the cellar ramp polygon when retail picks Path 6 (find_walkable → land on flat floor). |

User-visible deltas vs Wed morning baseline (2026-05-20):

  • Inn stairs UP — works (was broken)
  • Cellar descent — works (was broken)
  • 2nd floor walking — works (was broken; with caveats — phantom collisions occasionally)
  • Cellar ASCENT (stuck at top step) — still broken (this is issue #98)
  • Visible-through-walls in dungeons — issue #95 (separate scope)
  • Indoor lighting — A7 scope (separate phase)

What shipped this session (2026-05-22)

| Commit | What |
|---|---|
| 892019b | A6.P3 slice 2 v1: removed L622 per-tick CP seed (CP-write 91% reduction BUT broke BSP step_up at last step of stairs) |
| f8d669b | A6.P3 slice 2 v2: revert v1 + add no-op-if-unchanged guard inside CollisionInfo.SetContactPlane |
| d868946 | Slice 2 ship docs + filed issue #98 (cellar ascent stuck; originally hypothesized as cell-resolver ping-pong) |
| 8898166 | A6.P3 slice 3 v1: sphere-overlap stickiness in ResolveCellId (over-corrected; blocked legitimate cell transitions) |
| 3e140cf | A6.P3 slice 3 v2: switched to point-in stickiness; cell-resolver ping-pong CLOSED (data confirmed: 1 cell-transit event vs 20+ pre-fix) |
| ceeb06b | Slice 3 ship docs + #98 re-diagnosed (cellar-up symptom persists with NEW cause: BSP step-physics, not cell-resolver) |
| 0b44996 | Slice 4: added [poly-dump] probe in AdjustSphereToPlane; verifies dat fidelity by dumping polygon vertices+plane+sidesType on every push-back |
| 3198472 | Extended [cell-cache] probe with portalTargets list; shows which cells each portal connects to |
| 8bd3117 | A6.P3 slice 3 v3: REVERTED stickiness entirely (hypothesis-test for #98); cellar-up symptom persists |
| bbd1df4 | Slice 4: WalkInterp reset before placement_insert in DoStepDown (retail-faithful improvement; didn't fix #98 but kept as quality fix) |
| 134c9b8 | Retail cellar-up cdb capture: paired evidence for the Path-5 vs Path-6 diagnosis |
| efb5f2c | Issue #98 updated with sharpened diagnosis + failed-attempt log |

The sharp diagnosis for issue #98

Symptom: User walks UP the Holtburg cottage cellar in acdream. Runs into "an invisible roof or wall" at the top step. Animation plays but no Z progress. Stuck.

Paired evidence:

| Metric | Retail (success) | Acdream (stuck) |
|---|---|---|
| BP1 transitional_insert | 2,651 | (no acdream BP1 mirror) |
| BP2 step_up | 29 (incl. 1 on ramp slope) | |
| BP4 find_collisions | 4,032 | push-back-disp ~9000 |
| BP5 adjust_sphere | 30 (ALL on FLAT planes) | push-back ~1000 (270 on RAMP slope poly 0x0008) |
| BP6 check_walkable | 25 | indoor-walkable ~700 |
| BP7 set_contact_plane | 18 (all set same flat plane: (0,0,1) d=-93.9998 = world Z=94 = cottage main floor) | cp-write 229,300 (varying planes from many sites) |
| step_up_slide | (via BP2 = 29) | 159+ hits |
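The BP7 row equates the plane (0,0,1) d=-93.9998 with world Z=94. A minimal sketch of that arithmetic follows, assuming the plane convention n·p + d = 0 (an assumption that matches the numbers quoted above but is not confirmed against the source code). The helper name `plane_height_at` is hypothetical.

```python
def plane_height_at(normal, d, x=0.0, y=0.0):
    """Solve n.x*x + n.y*y + n.z*z + d = 0 for z.

    Assumes the plane is stored as (normal, d) with n . p + d = 0.
    """
    nx, ny, nz = normal
    if nz == 0:
        raise ValueError("vertical plane has no unique height")
    return -(nx * x + ny * y + d) / nz


# Flat contact plane from the retail BP7 capture row.
flat_floor_z = plane_height_at((0.0, 0.0, 1.0), -93.9998)
print(flat_floor_z)  # 93.9998, i.e. world Z of roughly 94
```

For a flat plane the result does not depend on x or y, which is why a single Z value identifies the cottage main floor.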

The divergence (pinpointed):

For the cellar ramp polygon (cellar cell 0xA9B40147, poly 0x0008, n=(0,-0.719,0.695), 46° walkable slope):

  • Retail's BSP picks Path 6 (find_walkable → land) — treats the ramp as a walkable floor. Smoothly LANDS the sphere on the ramp surface during step_down probe. Sets ContactPlane to the cottage main floor (flat plane at world Z=94 — the END goal of the ascent).

  • Acdream's BSP picks Path 5 (Contact → step_sphere_up → adjust_sphere push-back) — treats the ramp as a wall to push off. The push-back lifts the sphere by 0.75m and consumes all walk-interp. step_up's placement_insert then fails (the lifted position doesn't validate). step_up returns failure → step_up_slide fires → sphere slides along step_up_normal → loop. Player physically stuck.

Both retail and ours classify the ramp as walkable (N.Z=0.695 > FloorZ=0.6642). So the divergence isn't in the walkability check itself. It's in the path-selection logic inside BSPQuery.FindCollisions that decides whether to fire Path 5 vs Path 6 for a given polygon hit.
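The walkability numbers above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the only values taken from this doc are the ramp normal (0,-0.719,0.695) and FloorZ=0.6642.

```python
import math

FLOOR_Z = 0.6642  # walkable threshold quoted in this doc


def unit_z(normal):
    """Z component of the normal after normalising it to unit length."""
    nx, ny, nz = normal
    return nz / math.sqrt(nx * nx + ny * ny + nz * nz)


def slope_angle_deg(normal):
    """Angle between the polygon normal and world up (+Z), in degrees."""
    return math.degrees(math.acos(unit_z(normal)))


def is_walkable(normal, floor_z=FLOOR_Z):
    """Hypothetical helper: walkable when the unit normal's Z exceeds floor_z."""
    return unit_z(normal) > floor_z


ramp_normal = (0.0, -0.719, 0.695)  # cellar 0xA9B40147, poly 0x0008
print(round(slope_angle_deg(ramp_normal), 1))       # slope of the ramp, about 46 degrees
print(round(math.degrees(math.acos(FLOOR_Z)), 1))   # steepest walkable slope implied by FloorZ
print(is_walkable(ramp_normal))                      # True
```

The ramp sits roughly two degrees inside the walkable limit, so a small threshold mismatch (for example a stricter constant used at one call site) would be enough to flip the classification. That is one reason the LandingZ-vs-FloorZ candidate is worth checking.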

Code anchors for the next session:

  • src/AcDream.Core/Physics/BSPQuery.cs: FindCollisions dispatcher. Search for "Path 5" + "Path 6" comments. The path selection branches on ObjectInfo.State (Contact flag) + SpherePath.StepDown + SpherePath.StepUp.
  • The grounded player has Contact flag set (per PhysicsEngine.cs:597-598). So Path 5 fires first. Path 5 calls step_sphere_up → step_up → step_down (with step_up=1) → recursive BSP query.
  • The recursive BSP query (with StepDown=1, StepUp=1) should fire Path 6 — but maybe doesn't, OR fires Path 6 but Path 6's adjust_sphere on the ramp is what produces the broken push-back.
  • Retail's BSP behavior at the same site: step_up fires (BP2 hits), but adjust_sphere only fires on FLAT planes (BP5 all flat). So retail's step_down inside step_up doesn't push the sphere off the ramp slope.

Why the failed attempts today didn't land

| Attempt | What we tried | Why it didn't fix #98 |
|---|---|---|
| Slice 2 v1 (892019b): remove L622 seed | Eliminate the per-tick CP seed | The seed is load-bearing for step_up's AdjustOffset slope-projection on sub-step 1; removed it → all step_up broke |
| Slice 2 v2 (f8d669b): no-op guard in SetContactPlane | Make redundant CP writes a true no-op | Guard doesn't fire for the L622 seed because each tick gets a fresh Transition (ci.ContactPlaneValid=false on entry); useful for OTHER call sites but not the seed |
| Slice 3 v1 (8898166): sphere-overlap stickiness | Stop cell-resolver ping-pong | Over-corrected: held player in cellar even during legitimate transition; cellar-up still stuck |
| Slice 3 v2 (3e140cf): point-in stickiness | Less aggressive stickiness | CLOSED the ping-pong (data confirmed: 1 cell-transit vs 20+) but cellar-up still stuck; bug isn't cell-resolver |
| Slice 3 v3 (8bd3117): revert all stickiness | Hypothesis test: prove cell-resolver isn't the bug | Confirmed: cellar-up still stuck even without stickiness |
| Slice 4 (bbd1df4): reset WalkInterp before placement_insert | Match retail's walk_interp=1 reset pattern | Logical retail-faithful improvement but doesn't unblock cellar-up; kept in tree as quality fix |

Common pattern: I was guessing fixes at higher levels (cell resolution, CP retention, walk_interp) when the actual bug is deeper in BSP path-selection. The paired retail cdb capture finally pinpointed the divergence.

State of the four A6.P2 findings

| Finding | Status as of 2026-05-22 EOS |
|---|---|
| Finding 1: dispatcher entry frequency mismatch | CLOSED (as side-effect of slice 1 Finding 2 fix) |
| Finding 2: ContactPlane resynthesis blowup | PARTIALLY CLOSED (slice 1 stripped synthesis; slice 2 v2 added no-op guard; L622 seed retained as documented retail divergence per #96) |
| Finding 3: Indoor cell-resolver instability | CLOSED (slice 3 v2 point-in stickiness; ping-pong fully eliminated per data) |
| Finding 4: Portal-graph visibility blowup | OPEN as issue #95 (not A6 scope) |

Known open issues touched by A6 work

| Issue | Status |
|---|---|
| #83: Indoor multi-Z walking broken | Cellars + 2nd floor walking works; cellar-up still blocked by #98 |
| #88: Indoor static objects vibrate | Unchanged (deferred; hypothesis: closes with Finding 2 family) |
| #90: CellId ping-pong workaround | Now REDUNDANT after slice 3 v2; defer A6.P4 removal |
| #95: Portal-graph visibility blowup | OPEN (not A6 scope) |
| #96: L622 per-tick CP seed | PARTIALLY ADDRESSED, accepted as documented retail divergence |
| #97: Phantom collisions + fall-through on 2nd floor | OPEN (not re-tested post-slice-3-revert; hypothesis: same Path-5/Path-6 family as #98) |
| #98: Cellar ascent stuck at top step | OPEN; sharp Path-5-vs-Path-6 diagnosis ready for next session |

Test suite status

1148 pass + 8 pre-existing fail (baseline maintained throughout the session).

Next session — concrete starting steps

Goal: Fix #98 (cellar ascent stuck at top step) by correcting BSPQuery.FindCollisions path-selection so the cellar ramp triggers Path 6 (find_walkable land) instead of Path 5 (Contact step_up push-back).

Approach:

  1. Read retail's BSPTREE::find_collisions dispatcher in acclient_2013_pseudo_c.txt (search for BSPTREE::find_collisions). The 6-path dispatcher is at line ~322984 (where BP4 sits). Note exactly which path it picks for a grounded mover hitting a walkable slope.

  2. Read our BSPQuery.FindCollisions at src/AcDream.Core/Physics/BSPQuery.cs:1500+. Identify the path-selection branch that decides Path 5 vs Path 6 for the input (grounded=true, step_down=false, step_up=false, polygon.N.Z=0.695) case.

  3. Compare line-by-line. Likely candidates for the divergence:

    • Wrong state flag check (e.g. checking Contact when retail checks something else)
    • Wrong walkability gate (e.g. requiring N.Z >= LandingZ when retail requires >= FloorZ)
    • Wrong polygon-sidedness check (one-sided poly being treated as two-sided or vice versa)
    • Off-by-one in path numbering (Path 5 vs Path 6 swapped in our port)

  4. Fix surgically + verify via re-capture. Re-run the cellar-up scenario in acdream with ACDREAM_PROBE_POLY_DUMP=1. Compare the post-fix [push-back] distribution against retail's BP5 distribution from the 134c9b8 capture. Target: zero push-back hits on the ramp slope; CP set to flat cottage floor (matching retail).

  5. If the fix lands cleanly: also re-test #97 (phantom collisions + fall-through on 2nd floor — likely closes as side-effect because it's the same family).
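For the re-capture check in step 4, a small tally script may be enough to compare before and after. The sketch below assumes only that each probe line carries its tag (such as `[push-back]`) and that the polygon id appears on the line as the literal text `0x0008`. The real acdream.log field layout is not specified here, so the sample lines are synthetic placeholders and the function names are hypothetical.

```python
def count_tagged(lines, tag, needle=None):
    """Return (lines containing tag, of those: lines also containing needle)."""
    tagged = 0
    with_needle = 0
    for line in lines:
        if tag not in line:
            continue
        tagged += 1
        if needle is not None and needle in line:
            with_needle += 1
    return tagged, with_needle


def count_in_file(path, tag, needle=None):
    """Same tally, streamed from a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_tagged(handle, tag, needle)


# Synthetic lines, NOT the real log format; they only exercise the tally.
sample = [
    "[push-back] poly 0x0008 synthetic example line",
    "[push-back] poly 0x0003 synthetic example line",
    "[cell-cache] synthetic example line",
]
print(count_tagged(sample, "[push-back]", "0x0008"))  # (2, 1)
```

Against a real capture, `count_in_file(".../acdream.log", "[push-back]", "0x0008")` would give the total push-back count and the subset on the ramp polygon; the post-fix target from step 4 is a second value of zero. A plain substring match can over-count if `0x0008` also appears in another field, so treat the number as an upper bound until the line format is confirmed.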

Files almost certainly touched by the fix:

  • src/AcDream.Core/Physics/BSPQuery.cs — path-selection in FindCollisions
  • Possibly src/AcDream.Core/Physics/PhysicsGlobals.cs (LandingZ vs FloorZ threshold mismatch)

Files that DON'T need changing (already correct per today's investigation):

  • PhysicsEngine.cs ResolveCellId (cell-resolver works post-slice-3)
  • PhysicsEngine.cs L622 seed (retail divergence accepted)
  • TransitionTypes.cs ValidateTransition (Mechanism B works)
  • TransitionTypes.cs FindEnvCollisions indoor branch (slice 1 strip is correct)

Captures available for the next session

| Capture | What it shows |
|---|---|
| docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_polydump/acdream.log | Acdream stuck-at-cellar trace with [poly-dump] lines showing the ramp polygon vertices |
| docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_portaldump/acdream.log | Same cellar with [cell-cache] portalTargets=... showing the cellar's portals to 0x0146 + 0x0148 |
| docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/retail.{log,decoded.log} | Retail's successful cellar-up cdb trace; the gold-standard comparison data |
| docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_slice3v2/acdream.log | Pre-slice-3-revert cell-transit pattern (closed ping-pong, point-in stickiness) |
| docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_slice3v3_revert/acdream.log | Post-slice-3-revert (no stickiness); cellar-up still stuck → confirms cell-resolver isn't the bug |

Pickup prompt for fresh session

Open a new Claude Code session at this worktree's branch (claude/strange-albattani-3fc83c, HEAD at efb5f2c). Then paste:


Pick up A6.P3 — fix issue #98 (cellar ascent stuck at top step).

Read FIRST:
  docs/research/2026-05-22-a6-p3-handoff.md
  docs/ISSUES.md issue #98 entry (sharp diagnosis section)

Then state both altitudes:
  Currently working toward: M1.5 — Indoor world feels right
  Current phase: A6.P3 — fix issue #98 BSP path-selection
  Next concrete step: read retail's BSPTREE::find_collisions
  dispatcher (acclient_2013_pseudo_c.txt) + our BSPQuery.FindCollisions
  side-by-side; identify why our code picks Path 5 (Contact step_up)
  for the cellar ramp polygon when retail picks Path 6 (find_walkable
  land). The ramp is walkable (N.Z=0.695 > FloorZ=0.6642) so Path 6 is
  the correct choice for both clients.

Sharp diagnosis (from paired cdb captures committed 2026-05-22):
  - Retail's adjust_sphere fires 30x ALL on flat planes (Z=94 cottage main floor)
  - Acdream's push-back fires 270x on the RAMP slope (cellar 0xA9B40147 poly 0x0008)
  - Retail's BP7 set_contact_plane fires 18x with the SAME flat plane
  - Acdream cp-write fires 229,300x with varying planes from many sites

Captures available for comparison:
  - docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/
    (retail cellar-up cdb trace — gold-standard data)
  - docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_polydump/
    (acdream stuck-at-cellar with [poly-dump] lines)

DO NOT re-attempt the failed fixes from 2026-05-22 (handoff doc has
the full list with reasons each one didn't land). Specifically:
  - Don't try removing the L622 seed (breaks step_up)
  - Don't try removing slice-3 stickiness (already reverted; didn't help #98)
  - Don't try cell-resolver fixes (Finding 3 is closed)

Fix expected in BSPQuery.cs path-selection (the dispatcher branch
that decides Path 5 vs Path 6 for grounded movers hitting walkable
polys). Likely 5-20 lines of code change once the divergence is found.

After fix lands: re-capture scen4_cottage_cellar with the same probe
env vars to verify acdream now matches retail's flat-plane BP7
pattern. Also re-test #97 (phantom collisions + fall-through on 2nd
floor — hypothesized to close as side-effect of #98 fix).

Test suite baseline: 1148 pass + 8 pre-existing fail. Maintain through
the fix.

CLAUDE.md rules apply. No workarounds without explicit user approval.
Three failed visual verifications = handoff (we hit this 4x on the
2026-05-22 session — discipline check before attempting another guess
fix).

References