acdream/docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md
Erik bc1be26907 test(p2): faithful cellar-lip wedge reproduction + investigation apparatus (no fix yet)
P2 / M1.5 "blocked at the last step" cellar-lip wedge. This session built a faithful
deterministic reproduction and peeled the cause through six evidence-disproven framings
to one bounded question. NO fix landed — the last layers were each disproven by evidence,
and guessing at the load-bearing collision code is the saga's failure mode.

Apparatus:
- CellarLipWedgeTests.cs + Fixtures/cellar-lip/ (3 real cell dumps + wedge-records.jsonl =
  29 captured ACDREAM_CAPTURE_RESOLVE wedge calls). Replays the exact calls + body-before
  through the lip-cell engine: all 29 reproduce at 0% advance in <200 ms. Tests are
  documents-the-bug / diagnostics (GREEN while the wedge exists).
- TEMP probes ([path5-wall]/[fw-enter]/[find-walkable] in BSPQuery; [neg-poly]/[stepsphereup]/
  [stepdown-decide]/CheckOtherCells cn/sn/negHit in TransitionTypes), gated on
  ACDREAM_PROBE_INDOOR_BSP, marked STRIP. TransitionTypes neg-poly shortcut has a reverted-fix
  comment (slide attempt didn't clear the wedge).
- tools/cdb/retail-*-trace.cdb (retail cdb traces).

Findings (handoff: docs/research/2026-06-04-p2-cellar-lip-flatfloor-cp-handoff.md, see the
"NEXT-SESSION KICKOFF" at top):
- Flat-floor contact plane is retail-faithful (v1 trace, full-file correlation). NOT the bug.
- PosHitsSphere cull sign is retail-faithful (cdb -z verified; the Binary Ninja `test ah,N; jp`
  parity-jump reads inverted — caught + reverted a wrong fix from that mis-read).
- Sphere radius correct (0.48 player / 0.30 camera probe).
- Retail connector cell 0xA9B40175 never blocks (CEnvCell::find_collisions trace: 0 Collided/Slid).
- PINNED: during the step-up's step-down, BSPQuery.FindWalkableInternal is never called for cell
  0171, so the cottage floor (poly 0x0023, Z=94) is never tested as walkable -> no contact plane
  -> step-up fails -> StepUpSlide=Collided -> wedge.

Next: trace FindEnvCollisions -> FindCollisions path dispatch for 0171 during StepDown=true (why
StepSphereDown/find_walkable is skipped), port retail, validate via CellarLipWedgeTests, regress
DoorBugTrajectoryReplayTests + visual gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 08:30:36 +02:00

# P2 pickup — cellar-top corner wedge = cell-resolver ping-pong (re-diagnosed) reverting a WORKING step-up
> **🟢 SUPERSEDED 2026-06-04 PM — the wedge is NOT membership and NOT a reverted landing.**
> Canonical findings + full evidence chain are now in `memory/project_p2_door_stepup_findings.md`
> (the "RE-DIAGNOSIS 2" + "SLIDE LOCALIZED" + "FAILING CONDITION PINNED" entries). One-line summary:
> a live **retail cdb trace** proved retail's carried cell ALSO flips 0174/0175/0171 at the lip yet
> retail is smooth → membership ruled out. The wedge is a **step-up coin-flip**: the step-up's
> internal step-down FAILS to set a contact plane on the FLAT cottage floor (`cpValid=False`,
> `walkInterp=1.0`) while it works on the ramp slope. acdream's `StepSphereDown`/`AdjustSphereToPlane`
> are FAITHFUL to retail (verified vs `find_walkable` pc:326793 + `adjust_sphere_to_plane` pc:322032),
> so the obvious "set the CP anyway" fix DIVERGES from retail — do NOT ship it. **NEXT STEP (ready):**
> run `tools/cdb/retail-flatfloor-trace.cdb` on the live retail client at the cellar lip to see whether
> retail's `step_sphere_down` returns 3 (sets CP) or 1 (no CP) on the flat floor — that decides where
> retail establishes the flat-floor contact plane, then port it. 4 TEMP probes (gated on
> ACDREAM_PROBE_INDOOR_BSP, marked STRIP) are uncommitted in the worktree. The text below is HISTORY.
>
> **Canonical pickup, 2026-06-04.** Branch `claude/thirsty-goldberg-51bb9b` (do NOT
> branch/worktree; do NOT push without asking; NEVER `git stash`/`gc`). PowerShell on
> Windows; launch logs are UTF-16.
>
> **🔴 RE-DIAGNOSED 2026-06-04 (acdream corner trace) — the cellar wedge is a MEMBERSHIP
> bug, NOT collision.** The "## The cdb-pinned finding" below (retail steps up onto the
> floor) is correct for RETAIL, but instrumenting acdream (`ACDREAM_DUMP_STEPUP=1`) at the
> lip showed acdream's **step-up WORKS**: 518 attempts, **220 SUCCESS** landing the
> candidate on the cottage floor (`CheckPos Z=94.0`, normal `(0,0,1)`), 298 FAILED,
> alternating. But the **committed `CurPos` never advances** — it stays on the ramp at
> `(…,9.70,93.41)`; every success is REVERTED. `[cell-transit]` shows a **cell-resolver
> ping-pong every tick at the 3-cell junction: `0xA9B40175↔0174↔0171`, `reason=resolver`**.
> So `ResolveCellId` flips the cell each frame → the floor-landing is validated against the
> wrong cell + rejected → revert → oscillation → wedge. **NOT step-up (works), NOT
> edge-slide.** It's the #98/"Finding-3" cell-ping-pong family. **The fix is membership/
> cell-resolution stability at the junction — the PARKED, approval-gated (a) `ResolveCellId`
> demotion/stickiness from the master plan** (P1 claimed it was demoted out of the per-frame
> path, but this trace shows it's STILL driving per-frame cell changes here + unstable). The
> collision-side fixes (B1 `abbd761`, slide_sphere `0935a31`) are correct + KEEP. Apparatus:
> `acdream-corner-capture.jsonl` + the `stepup:`/`[cell-transit]` lines in
> `launch-acdream-corner.log`. **Next:** pin whether the commit-rejection is caused by the
> resolver flip (trace `ResolveWithTransition` validate/commit vs the cell change at the
> lip), then stabilize membership there (do NOT touch step-up/slide — they work).
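
The ping-pong described in the banner above can be checked mechanically once the per-tick cell ids have been pulled out of the `[cell-transit]` lines. The following is a minimal sketch, assuming the ids are already extracted into a list (the log line layout is not shown in this note, so no parsing is attempted). `count_flips`, `is_ping_pong`, and both thresholds are hypothetical and purely illustrative.

```python
def count_flips(cells):
    """Number of consecutive ticks on which the cell id changed."""
    return sum(1 for a, b in zip(cells, cells[1:]) if a != b)


def is_ping_pong(cells, max_distinct=3, min_flip_ratio=0.9):
    """Heuristic: nearly every tick flips, yet only a few cells are ever visited."""
    if len(cells) < 2:
        return False
    flip_ratio = count_flips(cells) / (len(cells) - 1)
    return len(set(cells)) <= max_distinct and flip_ratio >= min_flip_ratio


# Illustrative sequences only (not captured data). The ids are the low words
# of the three junction cells as written in this note.
junction = [0x0175, 0x0174, 0x0171] * 4
stable = [0x0171] * 12

print(is_ping_pong(junction), is_ping_pong(stable))
```

A check like this only flags the oscillation; it says nothing about whether the flip causes the commit rejection, which is the open question the banner asks to pin.
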
## State both altitudes
- **Milestone:** M1.5 — Indoor world feels right.
- **Phase:** P2 (door / building-shell collision) of the verbatim spatial-pipeline port.
- **Shipped this session (committed, branch HEAD `0935a31`):**
  - `abbd761` **B1 fix:** Path 5 (Contact) near-miss dispatch ported verbatim: gate
    behind `num_sphere > 1`, head-first order, `neg_step_up` mapping (head→false/slide,
    foot→true/step-up). Retail `transitional_insert`/`find_collisions` Contact branch
    (`acclient_2013_pseudo_c.txt:323838-323881`, `set_neg_poly_hit` :323279). Fixed the
    B1 grounded-step-up wedge (the handoff's "climb" localization was WRONG, as proved via
    `ITestOutputHelper` capture).
  - `0935a31` **slide_sphere fix:** head near-miss (`neg_step_up==0`) now calls the
    faithful `CSphere::slide_sphere` (existing `SlideSphereInternal`) + continues the
    insert loop, replacing the A6.P4 `Collided` shortcut (`transitional_insert`
    pc:273350-273351).
  - `f984e92`: docs (corrected the prior P2 handoff).
- **Visual-verified 2026-06-04:** generic step-up climbs; **closed cottage door still
BLOCKS** (slides tangentially, no walkthrough — regression check passed); **cellar
ascent went from ALWAYS-stuck → WORKS-MOSTLY.**
- **Remaining:** an **intermittent corner-wedge** at the cellar-top lip. Retail is
always smooth there (user-confirmed). So it's a real bug.
## The cdb-pinned finding (retail ground truth)
`tools/cdb/cellar-corner-escape.cdb` traced live retail at the cellar-top corner
(decode: `parse_corner_log.py`; raw: `cellar-corner-retail.log`). Retail escapes the
corner by **STEP-UP, not slide:**
- `step_sphere_up` → `step_up` fired **196×** vs only **38 near-misses**. `step_up`
normals: +X wall ×78, **ceiling `(0,0,-1)` ×36**, +Y wall ×32, -X wall ×18, ramp
slope `(0,0.62,0.78)` ×11, -Y wall ×10, floor `(0,0,1)` ×10. So retail steps up
against EVERY grounded full-hit at the corner.
- **Contact plane transitions ramp `N.z=0.78` (×63) → flat cottage floor `N.z=1.0`
(×76).** That's the escape: retail **climbs the lip off the ramp ONTO the cottage
floor.**
- The user's "run in place against the ceiling (not stuck)" = `step_up` failing on the
ceiling normal `(0,0,-1)` → `step_up_slide` (transient; steer out).

**Divergence pinned:** retail escapes by **stepping up onto the cottage floor**;
acdream **slides at the lip and never makes the ramp→floor transition**. The slide
itself (the `0935a31` fix) is correct + working; the gap is the **final lip-climb**.
This is the **original #98 core**: `DoStepDown`/`step_sphere_down` finding + landing
on the cottage floor, which B1+slide got close to but didn't finish.
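
The normals in the retail histogram above sort naturally by their Z component, which is also what separates the ramp contact plane (`N.z=0.78`) from the flat cottage floor (`N.z=1.0`). The following is a minimal sketch of that bucketing. `classify_normal`, `slope_degrees`, and the `walkable_z` cutoff are hypothetical; the cutoff is an illustrative parameter, not a value taken from retail.

```python
import math


def classify_normal(n, walkable_z=0.66, eps=1e-3):
    """Bucket a collision normal by its Z component (hypothetical helper)."""
    nz = n[2]
    if nz >= 1.0 - eps:
        return "floor"
    if nz <= -1.0 + eps:
        return "ceiling"
    if nz >= walkable_z:
        return "walkable slope"
    if abs(nz) <= eps:
        return "wall"
    return "steep/other"


def slope_degrees(n):
    """Angle between the surface and the horizontal, from the normal's Z share."""
    length = math.sqrt(sum(c * c for c in n))
    ratio = max(-1.0, min(1.0, n[2] / length))  # clamp against rounding error
    return math.degrees(math.acos(ratio))


# Normals quoted in this note; the labels are just for printing.
samples = [
    ("ramp", (0.0, 0.62, 0.78)),
    ("cottage floor", (0.0, 0.0, 1.0)),
    ("ceiling", (0.0, 0.0, -1.0)),
    ("+X wall", (1.0, 0.0, 0.0)),
]
for name, normal in samples:
    print(name, classify_normal(normal), round(slope_degrees(normal), 1))
```

The quoted ramp normal is only approximately unit length (it is rounded to two decimals), so `slope_degrees` normalises before taking the arccosine.
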
## Next step (evidence-first — #98 saga rule: do NOT guess)
1. **Instrument acdream's OWN corner path.** The captures so far
(`cellar-up-capture*.jsonl`, `door-recheck-capture.jsonl`) have positions/normals but
NOT the path. Need to answer: at the cellar-top lip, does acdream's `step_sphere_up` →
`DoStepUp` FIRE and FAIL to land on the cottage floor (DoStepDown can't find
`N.z=1.0` within `StepUpHeight=0.6`), or does it not fire (the hit goes to the slide
path instead)? Relaunch acdream with `ProbeBuildingEnabled` (→ `[neg-poly-dispatch]`/
`[bsp-test]`) + `ACDREAM_DUMP_STEPUP=1` + `ProbeStepWalkEnabled` (→ `[step-walk]`),
reproduce the wedge, read the path. (xunit-swallow doesn't apply to the live app —
Console probes DO surface in the launch log.)
2. **Compare to retail's 196 step_up / ramp→floor transition** and port the missing
lip-climb verbatim. Likely in `DoStepDown` (`TransitionTypes.cs:3074`) /
`BSPQuery.step_sphere_down` (:1206) / `find_walkable` (:693) — the cottage-floor
find+land. Retail anchors: `CTransition::step_up` pc:273099, `step_down` pc:272946,
`BSPTREE::step_sphere_down` pc:323665, `CObjCell::find_env_collisions` (the
walkable-refresh that overwrites the contact plane ramp→floor).
3. **USER VISUAL GATE:** cellar ascent clean (no intermittent wedge); door still blocks;
generic step-up still climbs.
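
For step 1, a first pass over the launch log can simply count how often each probe tag appears before anyone reads the path line by line. The following is a minimal sketch, assuming the log is the UTF-16 file that PowerShell redirection produces. `tally_probe_lines` is a hypothetical helper; the tags are the ones named in this note, and the demo lines are placeholders rather than real probe output.

```python
import os
import tempfile
from collections import Counter

# Probe tags named in this note; any line without one of them is ignored.
TAGS = ("[neg-poly-dispatch]", "[bsp-test]", "[step-walk]", "stepup:")


def tally_probe_lines(path, tags=TAGS):
    """Count launch-log lines per probe tag, decoding the file as UTF-16."""
    counts = Counter()
    with open(path, encoding="utf-16", errors="replace") as log:
        for line in log:
            for tag in tags:
                if tag in line:
                    counts[tag] += 1
    return counts


# Self-check on a throwaway UTF-16 file with placeholder line bodies.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-16", suffix=".log", delete=False
) as tmp:
    tmp.write("stepup: placeholder\n[step-walk] placeholder\nunrelated line\n")
demo_counts = tally_probe_lines(tmp.name)
os.unlink(tmp.name)
print(dict(demo_counts))
```

A zero `stepup:` count during a reproduced wedge would point toward the "does not fire" branch of the question; a non-zero count means the individual lines still have to be read to tell success from failure.
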
## Apparatus (committed / available)
- `tools/cdb/cellar-corner-escape.cdb` — retail corner trace (step_up/step_sphere_up/
neg_poly_hit/contact_plane counts + args; 30K threshold — TOO HIGH for these
lower-frequency BPs, lower to ~3000 next time so it auto-detaches in one wedge).
- `parse_corner_log.py` — decodes the cdb log (hex→float, histograms).
- Captures (UNCOMMITTED, in worktree root, ~32 MB each — do NOT commit):
`cellar-up-capture.jsonl` (v1, pre-slide-fix wedge), `cellar-up-capture-v2.jsonl`
(post-slide-fix: 96 hit-and-advanced slide frames), `door-recheck-capture.jsonl`,
`cellar-corner-retail.log` (the retail cdb trace).
- `analyze_cellar.py` / `analyze_v2.py` — ad-hoc capture analyzers (capture-specific).
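
`parse_corner_log.py` itself is not reproduced here. As a rough illustration of the hex→float and histogram steps it is described as doing, the sketch below assumes the cdb log prints each float argument as a 32-bit IEEE-754 bit pattern in hex (an assumption about the log format). `hex_to_float` and `normal_histogram` are hypothetical names.

```python
import struct
from collections import Counter


def hex_to_float(word):
    """Reinterpret a 32-bit hex dword (e.g. '3f800000') as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", int(word, 16)))[0]


def normal_histogram(rows, digits=2):
    """Count normals given as triples of hex dwords, rounded for bucketing."""
    hist = Counter()
    for row in rows:
        hist[tuple(round(hex_to_float(w), digits) for w in row)] += 1
    return hist


# Illustrative bit patterns only (not lines from the real trace).
rows = [
    ("00000000", "00000000", "3f800000"),  # (0, 0, 1)
    ("00000000", "00000000", "bf800000"),  # (0, 0, -1)
    ("00000000", "00000000", "3f800000"),  # (0, 0, 1)
]
for normal, count in normal_histogram(rows).most_common():
    print(normal, count)
```

Rounding before bucketing keeps near-identical normals in one histogram row; the two-decimal default mirrors the precision used for the normals quoted in this note.
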
## Test baseline
Core 1310 pass / 4 fail / 1 skip. The 4 fails are pre-existing documents-the-bug /
separate-issue: `DoorCollisionApparatusTests.Apparatus_Grounded_50cmOffCenter`
(synthetic-test artifact — terrain=-1000, no queryable floor; NOT a real door-block
failure — see `memory/project_p2_door_stepup_findings.md`), 2× `DoorBugTrajectoryReplay
LiveCompare_*` (compare against captured-BUGGY-live positions; need re-baseline), and
`BSPStepUpTests.D4` (airborne Path 6 sliding-normal persistence — separate). App 177 green.
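
Because the baseline carries four known failures, a raw "4 fail" count cannot distinguish the expected set from a swap in which one known failure is fixed and a new one appears. The following is a minimal sketch of a name-based check, assuming the failed test names are available as strings. `new_failures` is a hypothetical helper, and the wildcard patterns stand in for the names this note abbreviates.

```python
from fnmatch import fnmatchcase

# Known-failing patterns from the baseline above; wildcards cover the parts
# of the names that the note does not spell out.
KNOWN_FAILING = (
    "*Apparatus_Grounded_50cmOffCenter",
    "*LiveCompare_*",
    "*BSPStepUpTests.D4",
)


def new_failures(failed_tests, known=KNOWN_FAILING):
    """Return failures that do not match any known-failing pattern."""
    return sorted(
        t for t in failed_tests if not any(fnmatchcase(t, p) for p in known)
    )


# Illustrative run: two baseline failures plus one made-up regression.
failed = [
    "DoorCollisionApparatusTests.Apparatus_Grounded_50cmOffCenter",
    "BSPStepUpTests.D4",
    "SomeSuite.HypotheticalNewFailure",
]
print(new_failures(failed))
```

`fnmatchcase` is used instead of `fnmatch` so the comparison stays case-sensitive on Windows as well.
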
## Do NOT
- Guess (the #98 saga burned 10+ speculative fixes) — pin the mechanism with the apparatus first.
- Add a `ResolveCellId` stickiness clamp / suppression flag — the user chose the **principled**
P1 demotion, not a band-aid (no-workarounds rule).
- Flip `Apparatus_Grounded_50cmOffCenter` to `Assert.True(blocked)` — it blocks via a
synthetic-floor artifact, not a faithful door block.
- Re-investigate B1 (`abbd761`) or slide_sphere (`0935a31`) — both shipped + verified + correct.
## FRESH-SESSION KICKOFF PROMPT (copy-paste) — user-approved 2026-06-04: principled P1 membership fix
```
Continue the VERBATIM retail spatial-pipeline port for acdream. Branch claude/thirsty-goldberg-51bb9b
(do NOT branch/worktree; do NOT push without asking; NEVER git stash/gc). PowerShell on Windows;
launch logs are UTF-16.
STATE: M1.5 (Indoor world feels right). P2 COLLISION = DONE + shipped: B1 near-miss gate (abbd761) +
slide_sphere head-near-miss (0935a31). Generic step-up climbs; the closed cottage door BLOCKS (no
walkthrough); step-up AT THE CELLAR LIP works (220 successful candidate-landings on the cottage floor).
The remaining intermittent CELLAR-ASCENT WEDGE is RE-DIAGNOSED (live acdream + retail cdb traces) to a
MEMBERSHIP cell-resolver ping-pong — NOT collision. The user APPROVED the PRINCIPLED P1 fix (demote
ResolveCellId / swept curr_cell as per-frame authority), NOT a stickiness band-aid.
READ FIRST (in order):
1. docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md — RE-DIAGNOSIS banner + full evidence.
2. memory/project_p2_door_stepup_findings.md — RE-DIAGNOSIS 2026-06-04 entry + shipped fixes + do-not.
3. memory/project_retail_membership_criterion.md — P1 membership context (swept curr_cell pick).
4. docs/superpowers/specs/2026-06-03-verbatim-spatial-pipeline-port-master-plan.md — §A membership
A1-A9, §1 KEEP/REPLACE/DELETE (ResolveCellId -> spawn/teleport seed; per-frame from swept curr_cell),
parked (a)-(d).
THE FINDING (evidence): at the Holtburg cottage cellar-top lip (3-cell junction), acdream step-up
SUCCEEDS — lands CheckPos on the cottage floor (Z=94.0, normal (0,0,1)) 220/518 times, matching retail.
But committed CurPos never advances (stays on the ramp ~(…,9.70,93.41)); every success is REVERTED
because the cell PING-PONGS every tick (0xA9B40175<->0174<->0171, [cell-transit] reason=resolver) -> the
floor-landing is validated against the wrong cell + rejected. Retail (cdb) is smooth: step_up + contact
plane transitions ramp N.z=0.78 -> flat floor N.z=1.0 (76 landings), no cell ping-pong. This CONTRADICTS
P1's claim that ResolveCellId was demoted out of the per-frame path.
THE JOB (evidence-first; do NOT guess):
1. PIN the exact code path producing the per-frame [cell-transit] reason=resolver ping-pong at the lip
(is it PhysicsEngine.ResolveCellId despite P1's demotion claim, the swept advance, or
PlayerMovementController.UpdateCellId/UpdatePlayerCurrCell?), and CONFIRM the resolver flip CAUSES the
step-up commit-rejection (re-validation against the flipped cell) vs being a symptom.
2. PORT THE PRINCIPLED P1 FIX: make the swept curr_cell (find_cell_list pick over the uniform candidate
set) the per-frame membership authority at this junction; demote ResolveCellId to spawn/teleport seed.
Retail anchors: A1 CObjCell::find_cell_list 0x52b4e0 pc:308742; A8 change_cell/SetPositionInternal
0x513390/0x515330; A7 transitional_insert/validate_transition/check_other_cells. The cell must NOT
flip out from under a committed step-up. NO stickiness band-aid.
3. RED->GREEN: deterministic test for the lip junction (cell stable after step-up) + keep B1/B2/B3/door
tests green. USER VISUAL GATE: cellar ascent clean (no wedge); door still blocks; generic step-up climbs.
APPARATUS (in the worktree):
- acdream captures: acdream-corner-capture.jsonl (lip wedge: step-up-works + cell ping-pong),
cellar-up-capture-v2.jsonl, cellar-up-capture.jsonl (JSON Lines, ACDREAM_CAPTURE_RESOLVE, IsPlayer).
- Retail cdb: cellar-corner-retail.log + tools/cdb/cellar-corner-escape.cdb. Decode: parse_corner_log.py
/ tools/cdb/decode_retail_hex.py.
- Probes: ACDREAM_PROBE_CELL=1 ([cell-transit]), ACDREAM_DUMP_STEPUP=1 (stepup:), ACDREAM_PROBE_RESOLVE=1
([resolve]), ACDREAM_CAPTURE_RESOLVE=<path>. Live launch per CLAUDE.md "Running the client".
- cdb on retail at the lip (break CObjCell::find_cell_list / change_cell / SetPositionInternal) if the
decomp is ambiguous. PDB matches; tools/cdb/. Lower the trace threshold (~3000) so it auto-detaches in
one wedge.
DO NOT: re-investigate B1/slide_sphere (shipped, correct); add a ResolveCellId stickiness/suppression
band-aid (user chose principled); flip Apparatus_Grounded_50cmOffCenter to Assert.True(blocked)
(synthetic-floor artifact); guess.
TEST BASELINE: Core 1310 pass / 4 fail / 1 skip (the 4: Apparatus_Grounded_50cmOffCenter [synthetic-floor
artifact], 2x DoorBugTrajectoryReplay LiveCompare_* [captured-buggy-live, re-baseline], BSPStepUpTests.D4
[airborne Path 6, separate]); App 177 green. Branch HEAD: 664101f (+ this commit).
```