test(p2): faithful cellar-lip wedge reproduction + investigation apparatus (no fix yet)

P2 / M1.5 "blocked at the last step" cellar-lip wedge. This session built a faithful
deterministic reproduction and peeled the cause through six evidence-disproven framings
to one bounded question. NO fix landed — the last layers were each disproven by evidence,
and guessing at the load-bearing collision code is the saga's failure mode.

Apparatus:
- CellarLipWedgeTests.cs + Fixtures/cellar-lip/ (3 real cell dumps + wedge-records.jsonl =
  29 captured ACDREAM_CAPTURE_RESOLVE wedge calls). Replays the exact calls + body-before
  through the lip-cell engine: all 29 reproduce at 0% advance in <200 ms. The tests are
  diagnostics that document the bug, so they stay GREEN while the wedge exists.
- TEMP probes ([path5-wall]/[fw-enter]/[find-walkable] in BSPQuery; [neg-poly]/[stepsphereup]/
  [stepdown-decide]/CheckOtherCells cn/sn/negHit in TransitionTypes), gated on
  ACDREAM_PROBE_INDOOR_BSP, marked STRIP. TransitionTypes neg-poly shortcut has a reverted-fix
  comment (slide attempt didn't clear the wedge).
- tools/cdb/retail-*-trace.cdb (retail cdb traces).

Findings (handoff: docs/research/2026-06-04-p2-cellar-lip-flatfloor-cp-handoff.md, see the
"NEXT-SESSION KICKOFF" at top):
- Flat-floor contact plane is retail-faithful (v1 trace, full-file correlation). NOT the bug.
- PosHitsSphere cull sign is retail-faithful (cdb -z verified; the Binary Ninja `test ah,N; jp`
  parity-jump reads inverted — caught + reverted a wrong fix from that mis-read).
- Sphere radius correct (0.48 player / 0.30 camera probe).
- Retail connector cell 0xA9B40175 never blocks (CEnvCell::find_collisions trace: 0 Collided/Slid).
- PINNED: during the step-up's step-down, BSPQuery.FindWalkableInternal is never called for cell
  0171, so the cottage floor (poly 0x0023, Z=94) is never tested as walkable -> no contact plane
  -> step-up fails -> StepUpSlide=Collided -> wedge.

Next: trace FindEnvCollisions -> FindCollisions path dispatch for 0171 during StepDown=true (why
StepSphereDown/find_walkable is skipped), port retail, validate via CellarLipWedgeTests, regress
DoorBugTrajectoryReplayTests + visual gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Erik 2026-06-05 08:30:36 +02:00
parent 57435e912b
commit bc1be26907
12 changed files with 5824 additions and 3 deletions


@ -0,0 +1,34 @@
$$ Retail per-cell collide trace (2026-06-04 — the DECISIVE discriminator).
$$
$$ Question: when retail crosses the cellar lip, what does the per-cell collide call,
$$ CEnvCell::find_collisions, RETURN for the connector cell 0xA9B40175 and the floor
$$ cell 0xA9B40171 (and 0174)? check_other_cells invokes it once per cell
$$ (vtable[0x88]). Result enum: 1=OK 2=COLLIDED 3=ADJUSTED 4=SLID.
$$
$$ - If 0175 returns mostly 1 (OK) -> acdream's connector Slid is SPURIOUS
$$   (acdream over-detects / over-steps-up 0175). Fix = stop acdream blocking 0175.
$$ - If 0175 returns 4 (SLID) too -> retail slides+continues (no revert);
$$   acdream's wedge is the substep REVERT, not the collide. Look upstream.
$$
$$ ROBUST BY CONSTRUCTION (verified offline via `cdb -z acclient.exe`):
$$ CEnvCell::find_collisions @0x52c100 has a SINGLE exit at +0x1e (0x52c11e):
$$   0052c11e  pop edi    <-- esi still = this (CEnvCell*), eax = result
$$   0052c11f  pop esi
$$   0052c120  ret 4
$$ cell id = poi(esi+0x28) (CEnvCell.m_DID). No `gu`, no `qd` in the action.
$$ Reading this+result directly at the single exit is nesting-safe.
$$
$$ STEP 0 (optional re-verify): the `uf` below re-dumps the function; confirm the
$$ single exit is still at +0x1e and esi=this there, before driving.
$$
$$ Close retail when done to detach cdb (debuggee exit detaches; no qd needed).
.logopen C:\Users\erikn\source\repos\acdream\.claude\worktrees\thirsty-goldberg-51bb9b\retail-connector-collide-trace.log
.sympath C:\Users\erikn\source\repos\acdream\refs
.symopt+ 0x40
.reload /f acclient.exe
uf acclient!CEnvCell::find_collisions
$$ Log EVERY return for the lip cluster 0xA9B4017X (0171 floor / 0174 / 0175 connector).
$$ Also log any NON-OK (ret!=1) return for ANY Holtburg cell 0xA9B4xxxx (catch blocks
$$ outside the 017X cluster). Low volume; cheap action + gc.
bp acclient!CEnvCell::find_collisions+0x1e ".if ((poi(@esi+0x28) & 0xFFFFFFF0) == 0xA9B40170) { .printf \"lip cell=0x%x ret=%d\\n\", poi(@esi+0x28), @eax } .elsif (((poi(@esi+0x28) & 0xFFFF0000) == 0xA9B40000) & (@eax != 1)) { .printf \"blk cell=0x%x ret=%d\\n\", poi(@esi+0x28), @eax }; gc"
g
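
Once the trace above has produced a log, the two outcomes it is meant to separate (0175 mostly OK vs. 0175 also SLID) can be read off a per-cell tally. The following is a minimal Python sketch, not part of the commit: it assumes the log contains lines exactly in the two `.printf` formats used by the breakpoint action (`lip cell=0x... ret=N` and `blk cell=0x... ret=N`) and ignores everything else. The function name `tally_returns` is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Result names copied from the enum noted in the trace script header.
RESULT_NAMES = {1: "OK", 2: "COLLIDED", 3: "ADJUSTED", 4: "SLID"}

# Matches the two .printf formats in the breakpoint action:
#   "lip cell=0x%x ret=%d"  and  "blk cell=0x%x ret=%d"
LINE_RE = re.compile(r"^(lip|blk) cell=0x([0-9a-fA-F]+) ret=(-?\d+)$")


def tally_returns(lines):
    """Count find_collisions results per (tag, cell id).

    `lines` is any iterable of log lines; lines that do not match the
    trace formats (disassembly, cdb banners, ...) are skipped.
    """
    counts = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        tag = match.group(1)
        cell = int(match.group(2), 16)
        ret = int(match.group(3))
        name = RESULT_NAMES.get(ret, f"UNKNOWN({ret})")
        counts[(tag, cell)][name] += 1
    return counts


def format_tally(counts):
    """Render the tally as one text line per (tag, cell), sorted by cell id."""
    rows = []
    for (tag, cell), results in sorted(counts.items(), key=lambda kv: kv[0][1]):
        parts = ", ".join(f"{name}={n}" for name, n in sorted(results.items()))
        rows.append(f"{tag} 0x{cell:08X}: {parts}")
    return rows
```

To use it on a real run, pass the opened log file as `lines`, for example `format_tally(tally_returns(open(path, errors="replace")))`, where `path` is wherever `.logopen` wrote the log.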


@ -0,0 +1,26 @@
$$ Retail flat-floor contact-plane trace (2026-06-04, v2 — CORRECTED).
$$ Decisive question: when retail lands on the FLAT cottage floor during the climb, does
$$ BSPTREE::step_sphere_down SET the contact plane (return 3) or NOT (return 1)?
$$
$$ v1 (gu-in-bp-action) FAILED: "commands skipped ... target execution inside an event
$$ handler" corrupted eax -> a perfect 1,3,1,3 alternation artifact. DO NOT use `gu` in a
$$ bp action. v2: stash the carried cell in $t3 at ENTRY (arg2 = sphere_path at [esp+4],
$$ before the prologue), then break at the TWO RETURN addresses and print SET/NO + $t3.
$$
$$ STEP 0 (do this FIRST, before driving): the +0x218 (return 3) / +0x227 (return 1) offsets
$$ below are from the decomp (step_sphere_down @0x53a210; return 3 @0x53a428; return 1 @0x53a437)
$$ and MUST be verified against the live binary. After attaching, read the log: the `u` output
$$ (below) disassembles the function — confirm which addresses load eax=3 vs eax=1 (or jmp to the
$$ shared epilogue) and FIX the two `bp ...+0xNNN` offsets if they differ, then re-attach.
$$
$$ No qd / no Stop-Process needed if the user closes retail (debuggee exit detaches cdb).
.logopen C:\Users\erikn\source\repos\acdream\.claude\worktrees\thirsty-goldberg-51bb9b\retail-flatfloor-trace.log
.sympath C:\Users\erikn\source\repos\acdream\refs
.symopt+ 0x40
.reload /f acclient.exe
u acclient!BSPTREE::step_sphere_down L90
r @$t3 = 0
bp acclient!BSPTREE::step_sphere_down "r @$t3 = @@c++(((acclient!SPHEREPATH *)poi(@esp+4))->check_pos.objcell_id); gc"
bp acclient!BSPTREE::step_sphere_down+0x218 ".printf \"SET-CP cell=0x%x\\n\", @$t3; gc"
bp acclient!BSPTREE::step_sphere_down+0x227 ".printf \"NO-CP cell=0x%x\\n\", @$t3; gc"
g
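
Because the v1 trace failed by producing a perfect alternation of results, it may be worth checking a v2 log for the same signature before trusting it. The sketch below is an illustration in Python, not part of the commit: it assumes the log lines follow the `SET-CP cell=0x...` / `NO-CP cell=0x...` formats printed by the two return breakpoints. The function names and the `min_len` threshold are hypothetical choices.

```python
import re

# Matches the .printf formats of the two return breakpoints.
LINE_RE = re.compile(r"^(SET-CP|NO-CP) cell=0x([0-9a-fA-F]+)$")


def parse_events(lines):
    """Extract (outcome, cell id) pairs in log order, skipping other output."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            events.append((match.group(1), int(match.group(2), 16)))
    return events


def summarize(events):
    """Count SET-CP and NO-CP outcomes per carried cell."""
    per_cell = {}
    for outcome, cell in events:
        entry = per_cell.setdefault(cell, {"SET-CP": 0, "NO-CP": 0})
        entry[outcome] += 1
    return per_cell


def is_strict_alternation(events, min_len=8):
    """Return True when the outcome flips on every consecutive event.

    A long run with no two equal neighbours resembles the v1 artifact and
    suggests the trace itself is suspect. `min_len` is an arbitrary
    (assumed) threshold that keeps very short logs from being flagged.
    """
    outcomes = [outcome for outcome, _ in events]
    if len(outcomes) < min_len:
        return False
    return all(a != b for a, b in zip(outcomes, outcomes[1:]))
```

A strictly alternating log is not proof of corruption, only a prompt to re-check the breakpoint offsets and the action before drawing conclusions from the counts.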


@ -0,0 +1,13 @@
$$ Retail cellar-lip trace (2026-06-04). Captures, per CTransition::step_up,
$$ the CARRIED cell (sphere_path.check_pos.objcell_id) + world position
$$ (check_pos.frame.m_fOrigin). Discriminates whether retail's carried cell
$$ STAYS STABLE at the cellar lip (-> acdream's mid-step-up cell-flip is the bug)
$$ or ALTERNATES like acdream (-> the connector-cell slide is the bug).
$$ Auto-detaches (qd) after 150 step_ups so retail keeps running.
.logopen C:\Users\erikn\source\repos\acdream\.claude\worktrees\thirsty-goldberg-51bb9b\retail-lip-trace.log
.sympath C:\Users\erikn\source\repos\acdream\refs
.symopt+ 0x40
.reload /f acclient.exe
r @$t0 = 0
bp acclient!CTransition::step_up "r @$t0 = @$t0 + 1; .printf \"--- stepup #%d ---\\n\", @$t0; dt acclient!CTransition @ecx sphere_path.check_pos.objcell_id sphere_path.check_pos.frame.m_fOrigin.x sphere_path.check_pos.frame.m_fOrigin.y sphere_path.check_pos.frame.m_fOrigin.z; .if (@$t0 >= 150) { .printf \"=== DETACH after %d step_ups ===\\n\", @$t0; qd } .else { gc }"
g
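
The stable-versus-alternating question this trace poses can be stated as a small classifier over the sequence of carried cell ids, one per `step_up`. The Python sketch below is illustrative only and not part of the commit. It takes the ids as an already-extracted list, because the exact layout of the `dt` output is not shown here, and its "flips on at least half of the steps" rule is an assumed heuristic rather than a definition from the source.

```python
def classify_carried_cells(cell_ids):
    """Classify a per-step_up sequence of carried cell ids.

    Returns one of:
      "EMPTY"      - no samples
      "STABLE"     - the carried cell never changes
      "ALTERNATES" - exactly two cells, flipping on at least half of the
                     steps (assumed heuristic threshold)
      "MIXED"      - anything else, e.g. a single one-way transition
    """
    if not cell_ids:
        return "EMPTY"
    distinct = set(cell_ids)
    if len(distinct) == 1:
        return "STABLE"
    # Number of consecutive pairs where the carried cell changed.
    flips = sum(1 for a, b in zip(cell_ids, cell_ids[1:]) if a != b)
    if len(distinct) == 2 and flips >= (len(cell_ids) - 1) / 2:
        return "ALTERNATES"
    return "MIXED"
```

A single walk from one cell into the next yields one flip and is reported as "MIXED", so only repeated back-and-forth between two cells is labelled "ALTERNATES".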