acdream/docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md
Erik bc1be26907 test(p2): faithful cellar-lip wedge reproduction + investigation apparatus (no fix yet)
P2 / M1.5 "blocked at the last step" cellar-lip wedge. This session built a faithful
deterministic reproduction and peeled the cause through six evidence-disproven framings
to one bounded question. NO fix landed — the last layers were each disproven by evidence,
and guessing at the load-bearing collision code is the saga's failure mode.

Apparatus:
- CellarLipWedgeTests.cs + Fixtures/cellar-lip/ (3 real cell dumps + wedge-records.jsonl =
  29 captured ACDREAM_CAPTURE_RESOLVE wedge calls). Replays the exact calls + body-before
  through the lip-cell engine: all 29 reproduce at 0% advance in <200 ms. Tests are
  documents-the-bug / diagnostics (GREEN while the wedge exists).
- TEMP probes ([path5-wall]/[fw-enter]/[find-walkable] in BSPQuery; [neg-poly]/[stepsphereup]/
  [stepdown-decide]/CheckOtherCells cn/sn/negHit in TransitionTypes), gated on
  ACDREAM_PROBE_INDOOR_BSP, marked STRIP. TransitionTypes neg-poly shortcut has a reverted-fix
  comment (slide attempt didn't clear the wedge).
- tools/cdb/retail-*-trace.cdb (retail cdb traces).

Findings (handoff: docs/research/2026-06-04-p2-cellar-lip-flatfloor-cp-handoff.md, see the
"NEXT-SESSION KICKOFF" at top):
- Flat-floor contact plane is retail-faithful (v1 trace, full-file correlation). NOT the bug.
- PosHitsSphere cull sign is retail-faithful (cdb -z verified; the Binary Ninja `test ah,N; jp`
  parity-jump reads inverted — caught + reverted a wrong fix from that mis-read).
- Sphere radius correct (0.48 player / 0.30 camera probe).
- Retail connector cell 0xA9B40175 never blocks (CEnvCell::find_collisions trace: 0 Collided/Slid).
- PINNED: during the step-up's step-down, BSPQuery.FindWalkableInternal is never called for cell
  0171, so the cottage floor (poly 0x0023, Z=94) is never tested as walkable -> no contact plane
  -> step-up fails -> StepUpSlide=Collided -> wedge.

Next: trace FindEnvCollisions -> FindCollisions path dispatch for 0171 during StepDown=true (why
StepSphereDown/find_walkable is skipped), port retail, validate via CellarLipWedgeTests, regress
DoorBugTrajectoryReplayTests + visual gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 08:30:36 +02:00


P2 pickup — cellar-top corner wedge = cell-resolver ping-pong (re-diagnosed) reverting a WORKING step-up

🟢 SUPERSEDED 2026-06-04 PM — the wedge is NOT membership and NOT a reverted landing. Canonical findings + full evidence chain are now in memory/project_p2_door_stepup_findings.md (the "RE-DIAGNOSIS 2" + "SLIDE LOCALIZED" + "FAILING CONDITION PINNED" entries). One-line summary: a live retail cdb trace proved retail's carried cell ALSO flips 0174/0175/0171 at the lip yet retail is smooth → membership ruled out. The wedge is a step-up coin-flip: the step-up's internal step-down FAILS to set a contact plane on the FLAT cottage floor (cpValid=False, walkInterp=1.0) while it works on the ramp slope. acdream's StepSphereDown/AdjustSphereToPlane are FAITHFUL to retail (verified vs find_walkable pc:326793 + adjust_sphere_to_plane pc:322032), so the obvious "set the CP anyway" fix DIVERGES from retail — do NOT ship it. NEXT STEP (ready): run tools/cdb/retail-flatfloor-trace.cdb on the live retail client at the cellar lip to see whether retail's step_sphere_down returns 3 (sets CP) or 1 (no CP) on the flat floor — that decides where retail establishes the flat-floor contact plane, then port it. 4 TEMP probes (gated on ACDREAM_PROBE_INDOOR_BSP, marked STRIP) are uncommitted in the worktree. The text below is HISTORY.

Canonical pickup, 2026-06-04. Branch claude/thirsty-goldberg-51bb9b (do NOT branch/worktree; do NOT push without asking; NEVER git stash/gc). PowerShell on Windows; launch logs are UTF-16.

🔴 RE-DIAGNOSED 2026-06-04 (acdream corner trace) — the cellar wedge is a MEMBERSHIP bug, NOT collision. The "## The cdb-pinned finding" below (retail steps up onto the floor) is correct for RETAIL, but instrumenting acdream (ACDREAM_DUMP_STEPUP=1) at the lip showed acdream's step-up WORKS: 518 attempts, 220 SUCCESS landing the candidate on the cottage floor (CheckPos Z=94.0, normal (0,0,1)), 298 FAILED, alternating. But the committed CurPos never advances — it stays on the ramp at (…,9.70,93.41); every success is REVERTED. [cell-transit] shows a cell-resolver ping-pong every tick at the 3-cell junction: 0xA9B40175↔0174↔0171, reason=resolver. So ResolveCellId flips the cell each frame → the floor-landing is validated against the wrong cell + rejected → revert → oscillation → wedge. NOT step-up (works), NOT edge-slide. It's the #98/"Finding-3" cell-ping-pong family. The fix is membership/ cell-resolution stability at the junction — the PARKED, approval-gated (a) ResolveCellId demotion/stickiness from the master plan (P1 claimed it was demoted out of the per-frame path, but this trace shows it's STILL driving per-frame cell changes here + unstable). The collision-side fixes (B1 abbd761, slide_sphere 0935a31) are correct + KEEP. Apparatus: acdream-corner-capture.jsonl + the stepup:/[cell-transit] lines in launch-acdream-corner.log. Next: pin whether the commit-rejection is caused by the resolver flip (trace ResolveWithTransition validate/commit vs the cell change at the lip), then stabilize membership there (do NOT touch step-up/slide — they work).
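The ping-pong signature described above (the carried cell changing on every tick while cycling through the same three cells) can be checked mechanically once the per-tick cell ids are extracted from the [cell-transit] lines. A minimal sketch, assuming the ids are already available as a list of integers; the function names, the 0.9 threshold, and the sample sequences are hypothetical illustrations, not captured data:

```python
from typing import Iterable


def count_cell_flips(cells: Iterable[int]) -> int:
    """Count tick-to-tick changes of the carried cell id."""
    seq = list(cells)
    return sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)


def is_ping_pong(cells: Iterable[int], min_flip_ratio: float = 0.9,
                 max_distinct: int = 3) -> bool:
    """True when the cell changes on (almost) every tick while staying
    inside a small set of cells, i.e. it never settles."""
    seq = list(cells)
    if len(seq) < 3:
        return False
    flip_ratio = count_cell_flips(seq) / (len(seq) - 1)
    return flip_ratio >= min_flip_ratio and len(set(seq)) <= max_distinct


# Illustrative sequences only, using the low words of the three junction cells.
lip_sequence = [0x0175, 0x0174, 0x0171] * 4
settled_sequence = [0x0175] + [0x0171] * 11
```

A flip count only describes the symptom; it does not by itself show that the flips cause the revert, so the same check is worth running on a retail trace for comparison.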

State both altitudes

  • Milestone: M1.5 — Indoor world feels right.
  • Phase: P2 (door / building-shell collision) of the verbatim spatial-pipeline port.
  • Shipped this session (committed, branch HEAD 0935a31):
    • abbd761: B1 fix. Path 5 (Contact) near-miss dispatch ported verbatim: gate behind num_sphere > 1, head-first order, neg_step_up mapping (head→false/slide, foot→true/step-up). Retail transitional_insert/find_collisions Contact branch (acclient_2013_pseudo_c.txt:323838-323881, set_neg_poly_hit :323279). Fixed the B1 grounded-step-up wedge (the handoff's "climb" localization was WRONG, as proved via ITestOutputHelper capture).
    • 0935a31: slide_sphere fix. Head near-miss (neg_step_up==0) now calls the faithful CSphere::slide_sphere (existing SlideSphereInternal) + continues the insert loop, replacing the A6.P4 Collided shortcut (transitional_insert pc:273350-273351).
    • f984e92: docs (corrected the prior P2 handoff).
  • Visual-verified 2026-06-04: generic step-up climbs; closed cottage door still BLOCKS (slides tangentially, no walkthrough — regression check passed); cellar ascent went from ALWAYS-stuck → WORKS-MOSTLY.
  • Remaining: an intermittent corner-wedge at the cellar-top lip. Retail is always smooth there (user-confirmed). So it's a real bug.

The cdb-pinned finding (retail ground truth)

tools/cdb/cellar-corner-escape.cdb traced live retail at the cellar-top corner (decode: parse_corner_log.py; raw: cellar-corner-retail.log). Retail escapes the corner by STEP-UP, not slide:

  • step_sphere_up→step_up fired 196× vs only 38 near-misses. step_up normals: +X wall ×78, ceiling (0,0,-1) ×36, +Y wall ×32, -X wall ×18, ramp slope (0,0.62,0.78) ×11, -Y wall ×10, floor (0,0,1) ×10. So retail steps up against EVERY grounded full-hit at the corner.
  • Contact plane transitions ramp N.z=0.78 (×63) → flat cottage floor N.z=1.0 (×76). That's the escape: retail climbs the lip off the ramp ONTO the cottage floor.
  • The user's "run in place against the ceiling (not stuck)" = step_up failing on the ceiling normal (0,0,-1) → step_up_slide (transient; steer out).
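The normal categories in the histogram above (wall by axis, ceiling, ramp slope, floor) can be reproduced from raw normals with a small classifier. A minimal sketch, assuming unit-length normals and a hypothetical tolerance of 0.01; the label strings mirror the ones used in the list, and the sample normals are illustrative rather than taken from the trace:

```python
from collections import Counter
from typing import Tuple

Vec3 = Tuple[float, float, float]


def label_normal(n: Vec3, eps: float = 0.01) -> str:
    """Bucket a unit collision normal into the categories used in the trace summary."""
    x, y, z = n
    if z >= 1.0 - eps:
        return "floor"
    if z <= -1.0 + eps:
        return "ceiling"
    if abs(z) > eps:
        return "slope"          # e.g. the ramp normal (0, 0.62, 0.78)
    # Vertical surface: name it by its dominant horizontal axis and sign.
    if abs(x) >= abs(y):
        return "+X wall" if x > 0 else "-X wall"
    return "+Y wall" if y > 0 else "-Y wall"


# Illustrative normals only.
samples = [
    (1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, -1.0),
    (0.0, 0.62, 0.78), (0.0, 0.0, 1.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0),
]
histogram = Counter(label_normal(n) for n in samples)
```

Applying the same bucketing to an acdream capture gives a like-for-like histogram to set beside the retail counts.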

Divergence pinned: retail escapes by stepping up onto the cottage floor; acdream slides at the lip and never makes the ramp→floor transition. The slide itself (the 0935a31 fix) is correct + working; the gap is the final lip-climb. This is the original #98 core: the DoStepDown/step_sphere_down finding + landing on the cottage floor, which B1+slide got close to but didn't finish.

Next step (evidence-first — #98 saga rule: do NOT guess)

  1. Instrument acdream's OWN corner path. The captures so far (cellar-up-capture*.jsonl, door-recheck-capture.jsonl) have positions/normals but NOT the path. Need to answer: at the cellar-top lip, does acdream's step_sphere_up→DoStepUp FIRE and FAIL to land on the cottage floor (DoStepDown can't find N.z=1.0 within StepUpHeight=0.6), or does it not fire (the hit goes to the slide path instead)? Relaunch acdream with ProbeBuildingEnabled (→ [neg-poly-dispatch]/[bsp-test]) + ACDREAM_DUMP_STEPUP=1 + ProbeStepWalkEnabled (→ [step-walk]), reproduce the wedge, read the path. (xunit-swallow doesn't apply to the live app; Console probes DO surface in the launch log.)
  2. Compare to retail's 196 step_up / ramp→floor transition and port the missing lip-climb verbatim. Likely in DoStepDown (TransitionTypes.cs:3074) / BSPQuery.step_sphere_down (:1206) / find_walkable (:693) — the cottage-floor find+land. Retail anchors: CTransition::step_up pc:273099, step_down pc:272946, BSPTREE::step_sphere_down pc:323665, CObjCell::find_env_collisions (the walkable-refresh that overwrites the contact plane ramp→floor).
  3. USER VISUAL GATE: cellar ascent clean (no intermittent wedge); door still blocks; generic step-up still climbs.
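Reading the path out of the launch log in step 1 can be scripted rather than eyeballed. A minimal sketch, assuming the log is UTF-16 with a byte-order mark and that each probe line contains its tag literally (for example `[step-walk]` or `stepup:`); the function name and the idea of counting per tag are hypothetical, and the actual line layout after the tag is not assumed:

```python
from pathlib import Path
from typing import Dict, Iterable


def count_probe_tags(log_path: str, tags: Iterable[str]) -> Dict[str, int]:
    """Count how many lines of a UTF-16 launch log contain each probe tag."""
    # "utf-16" honours the BOM; errors="replace" keeps a damaged line from
    # aborting the whole read.
    text = Path(log_path).read_text(encoding="utf-16", errors="replace")
    counts = {tag: 0 for tag in tags}
    for line in text.splitlines():
        for tag in counts:
            if tag in line:
                counts[tag] += 1
    return counts
```

A zero count for `stepup:` next to a non-zero count for the slide-side tags would point at "does not fire"; non-zero counts on both sides mean the individual lines need reading.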

Apparatus (committed / available)

  • tools/cdb/cellar-corner-escape.cdb — retail corner trace (step_up/step_sphere_up/ neg_poly_hit/contact_plane counts + args; 30K threshold — TOO HIGH for these lower-frequency BPs, lower to ~3000 next time so it auto-detaches in one wedge).
  • parse_corner_log.py — decodes the cdb log (hex→float, histograms).
  • Captures (UNCOMMITTED, in worktree root, ~32 MB each — do NOT commit): cellar-up-capture.jsonl (v1, pre-slide-fix wedge), cellar-up-capture-v2.jsonl (post-slide-fix: 96 hit-and-advanced slide frames), door-recheck-capture.jsonl, cellar-corner-retail.log (the retail cdb trace).
  • analyze_cellar.py / analyze_v2.py — ad-hoc capture analyzers (capture-specific).
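For reference, the hex→float decode and histogram step that parse_corner_log.py performs can be sketched as below. This is not the script itself: it assumes the cdb log prints each float argument as a 32-bit hex dword (IEEE-754 single precision), and the function names and the rounding to two decimals are hypothetical choices.

```python
import struct
from collections import Counter
from typing import Iterable, Sequence, Tuple


def dword_to_float(hex_dword: str) -> float:
    """Reinterpret a 32-bit hex dword (e.g. '3f800000') as an IEEE-754 float."""
    value = int(hex_dword, 16)
    return struct.unpack("<f", struct.pack("<I", value))[0]


def normal_histogram(rows: Iterable[Sequence[str]]) -> Counter:
    """Histogram of normals, each row being three hex dwords (x, y, z)."""
    hist: Counter = Counter()
    for row in rows:
        normal: Tuple[float, ...] = tuple(round(dword_to_float(h), 2) for h in row)
        hist[normal] += 1
    return hist
```

Rounding before counting keeps near-identical normals (such as the ramp slope) in one bucket instead of splitting them across float noise.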

Test baseline

Core 1310 pass / 4 fail / 1 skip. The 4 fails are pre-existing documents-the-bug / separate-issue: DoorCollisionApparatusTests.Apparatus_Grounded_50cmOffCenter (synthetic-test artifact — terrain=-1000, no queryable floor; NOT a real door-block failure — see memory/project_p2_door_stepup_findings.md), 2× DoorBugTrajectoryReplay LiveCompare_* (compare against captured-BUGGY-live positions; need re-baseline), and BSPStepUpTests.D4 (airborne Path 6 sliding-normal persistence — separate). App 177 green.

Do NOT

  • Guess (the #98 saga burned 10+ speculative fixes) — pin the mechanism with the apparatus first.
  • Add a ResolveCellId stickiness clamp / suppression flag — the user chose the principled P1 demotion, not a band-aid (no-workarounds rule).
  • Flip Apparatus_Grounded_50cmOffCenter to Assert.True(blocked) — it blocks via a synthetic-floor artifact, not a faithful door block.
  • Re-investigate B1 (abbd761) or slide_sphere (0935a31) — both shipped + verified + correct.

FRESH-SESSION KICKOFF PROMPT (copy-paste) — user-approved 2026-06-04: principled P1 membership fix

Continue the VERBATIM retail spatial-pipeline port for acdream. Branch claude/thirsty-goldberg-51bb9b
(do NOT branch/worktree; do NOT push without asking; NEVER git stash/gc). PowerShell on Windows;
launch logs are UTF-16.

STATE: M1.5 (Indoor world feels right). P2 COLLISION = DONE + shipped: B1 near-miss gate (abbd761) +
slide_sphere head-near-miss (0935a31). Generic step-up climbs; the closed cottage door BLOCKS (no
walkthrough); step-up AT THE CELLAR LIP works (220 successful candidate-landings on the cottage floor).
The remaining intermittent CELLAR-ASCENT WEDGE is RE-DIAGNOSED (live acdream + retail cdb traces) to a
MEMBERSHIP cell-resolver ping-pong — NOT collision. The user APPROVED the PRINCIPLED P1 fix (demote
ResolveCellId / swept curr_cell as per-frame authority), NOT a stickiness band-aid.

READ FIRST (in order):
1. docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md — RE-DIAGNOSIS banner + full evidence.
2. memory/project_p2_door_stepup_findings.md — RE-DIAGNOSIS 2026-06-04 entry + shipped fixes + do-not.
3. memory/project_retail_membership_criterion.md — P1 membership context (swept curr_cell pick).
4. docs/superpowers/specs/2026-06-03-verbatim-spatial-pipeline-port-master-plan.md — §A membership
   A1-A9, §1 KEEP/REPLACE/DELETE (ResolveCellId -> spawn/teleport seed; per-frame from swept curr_cell),
   parked (a)-(d).

THE FINDING (evidence): at the Holtburg cottage cellar-top lip (3-cell junction), acdream step-up
SUCCEEDS — lands CheckPos on the cottage floor (Z=94.0, normal (0,0,1)) 220/518 times, matching retail.
But committed CurPos never advances (stays on the ramp ~(…,9.70,93.41)); every success is REVERTED
because the cell PING-PONGS every tick (0xA9B40175<->0174<->0171, [cell-transit] reason=resolver) -> the
floor-landing is validated against the wrong cell + rejected. Retail (cdb) is smooth: step_up + contact
plane transitions ramp N.z=0.78 -> flat floor N.z=1.0 (76 landings), no cell ping-pong. This CONTRADICTS
P1's claim that ResolveCellId was demoted out of the per-frame path.

THE JOB (evidence-first; do NOT guess):
1. PIN the exact code path producing the per-frame [cell-transit] reason=resolver ping-pong at the lip
   (is it PhysicsEngine.ResolveCellId despite P1's demotion claim, the swept advance, or
   PlayerMovementController.UpdateCellId/UpdatePlayerCurrCell?), and CONFIRM the resolver flip CAUSES the
   step-up commit-rejection (re-validation against the flipped cell) vs being a symptom.
2. PORT THE PRINCIPLED P1 FIX: make the swept curr_cell (find_cell_list pick over the uniform candidate
   set) the per-frame membership authority at this junction; demote ResolveCellId to spawn/teleport seed.
   Retail anchors: A1 CObjCell::find_cell_list 0x52b4e0 pc:308742; A8 change_cell/SetPositionInternal
   0x513390/0x515330; A7 transitional_insert/validate_transition/check_other_cells. The cell must NOT
   flip out from under a committed step-up. NO stickiness band-aid.
3. RED->GREEN: deterministic test for the lip junction (cell stable after step-up) + keep B1/B2/B3/door
   tests green. USER VISUAL GATE: cellar ascent clean (no wedge); door still blocks; generic step-up climbs.

APPARATUS (in the worktree):
- acdream captures: acdream-corner-capture.jsonl (lip wedge: step-up-works + cell ping-pong),
  cellar-up-capture-v2.jsonl, cellar-up-capture.jsonl (JSON Lines, ACDREAM_CAPTURE_RESOLVE, IsPlayer).
- Retail cdb: cellar-corner-retail.log + tools/cdb/cellar-corner-escape.cdb. Decode: parse_corner_log.py
  / tools/cdb/decode_retail_hex.py.
- Probes: ACDREAM_PROBE_CELL=1 ([cell-transit]), ACDREAM_DUMP_STEPUP=1 (stepup:), ACDREAM_PROBE_RESOLVE=1
  ([resolve]), ACDREAM_CAPTURE_RESOLVE=<path>. Live launch per CLAUDE.md "Running the client".
- cdb on retail at the lip (break CObjCell::find_cell_list / change_cell / SetPositionInternal) if the
  decomp is ambiguous. PDB matches; tools/cdb/. Lower the trace threshold (~3000) so it auto-detaches in
  one wedge.

DO NOT: re-investigate B1/slide_sphere (shipped, correct); add a ResolveCellId stickiness/suppression
band-aid (user chose principled); flip Apparatus_Grounded_50cmOffCenter to Assert.True(blocked)
(synthetic-floor artifact); guess.

TEST BASELINE: Core 1310 pass / 4 fail / 1 skip (the 4: Apparatus_Grounded_50cmOffCenter [synthetic-floor
artifact], 2x DoorBugTrajectoryReplay LiveCompare_* [captured-buggy-live, re-baseline], BSPStepUpTests.D4
[airborne Path 6, separate]); App 177 green. Branch HEAD: 664101f (+ this commit).