acdream/docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md
Erik 57435e912b docs(p2): fresh-session kickoff prompt — principled P1 membership fix (user-approved)
Appends the copy-paste kickoff prompt for the next session: pursue the principled
P1 fix for the cellar-lip cell-resolver ping-pong (demote ResolveCellId / make the
swept curr_cell the per-frame membership authority), NOT a stickiness band-aid.
Captures the evidence, apparatus, retail anchors, do-not list, and test baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 11:43:11 +02:00

# P2 pickup — cellar-top corner wedge = cell-resolver ping-pong (re-diagnosed) reverting a WORKING step-up
> **Canonical pickup, 2026-06-04.** Branch `claude/thirsty-goldberg-51bb9b` (do NOT
> branch/worktree; do NOT push without asking; NEVER `git stash`/`gc`). PowerShell on
> Windows; launch logs are UTF-16.
>
> **🔴 RE-DIAGNOSED 2026-06-04 (acdream corner trace) — the cellar wedge is a MEMBERSHIP
> bug, NOT collision.** The "## The cdb-pinned finding" below (retail steps up onto the
> floor) is correct for RETAIL, but instrumenting acdream (`ACDREAM_DUMP_STEPUP=1`) at the
> lip showed acdream's **step-up WORKS**: 518 attempts, **220 SUCCESS** landing the
> candidate on the cottage floor (`CheckPos Z=94.0`, normal `(0,0,1)`), 298 FAILED,
> alternating. But the **committed `CurPos` never advances** — it stays on the ramp at
> `(…,9.70,93.41)`; every success is REVERTED. `[cell-transit]` shows a **cell-resolver
> ping-pong every tick at the 3-cell junction: `0xA9B40175↔0174↔0171`, `reason=resolver`**.
> So `ResolveCellId` flips the cell each frame → the floor-landing is validated against the
> wrong cell + rejected → revert → oscillation → wedge. **NOT step-up (works), NOT
> edge-slide.** It's the #98/"Finding-3" cell-ping-pong family. **The fix is membership/
> cell-resolution stability at the junction — the PARKED, approval-gated (a) `ResolveCellId`
> demotion/stickiness from the master plan** (P1 claimed it was demoted out of the per-frame
> path, but this trace shows it's STILL driving per-frame cell changes here + unstable). The
> collision-side fixes (B1 `abbd761`, slide_sphere `0935a31`) are correct + KEEP. Apparatus:
> `acdream-corner-capture.jsonl` + the `stepup:`/`[cell-transit]` lines in
> `launch-acdream-corner.log`. **Next:** pin whether the commit-rejection is caused by the
> resolver flip (trace `ResolveWithTransition` validate/commit vs the cell change at the
> lip), then stabilize membership there (do NOT touch step-up/slide — they work).
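
A minimal Python sketch for checking the ping-pong signature offline. It assumes the per-tick cell ids have already been extracted from the `[cell-transit]` lines into a list; that extraction step, the function names, and the `min_flip_ratio` threshold are illustrative assumptions, not part of the existing apparatus.

```python
from collections import Counter


def summarize_cell_flips(cell_ids):
    """Count per-tick cell changes in a sequence of cell ids.

    Returns (flip_count, pair_counts); pair_counts maps an unordered pair
    of cells to the number of times the cell changed between them.
    """
    flips = 0
    pairs = Counter()
    for prev, curr in zip(cell_ids, cell_ids[1:]):
        if prev != curr:
            flips += 1
            pairs[frozenset((prev, curr))] += 1
    return flips, pairs


def looks_like_ping_pong(cell_ids, min_flip_ratio=0.9):
    """True when nearly every tick changes cell (threshold is an assumption)."""
    if len(cell_ids) < 2:
        return False
    flips, _ = summarize_cell_flips(cell_ids)
    return flips / (len(cell_ids) - 1) >= min_flip_ratio


# Illustrative sequence only, using the short cell labels from the trace.
sample = ["0175", "0174", "0171"] * 4
flips, pairs = summarize_cell_flips(sample)
```

A stable membership trace would report a flip ratio near zero while standing at the lip; a ratio near one over many ticks is the wedge signature described above.
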
## State both altitudes
- **Milestone:** M1.5 — Indoor world feels right.
- **Phase:** P2 (door / building-shell collision) of the verbatim spatial-pipeline port.
- **Shipped this session (committed, branch HEAD `0935a31`):**
  - `abbd761` **B1 fix:** Path 5 (Contact) near-miss dispatch ported verbatim: gate
    behind `num_sphere > 1`, head-first order, `neg_step_up` mapping (head→false/slide,
    foot→true/step-up). Retail `transitional_insert`/`find_collisions` Contact branch
    (`acclient_2013_pseudo_c.txt:323838-323881`, `set_neg_poly_hit` :323279). Fixed the
    B1 grounded-step-up wedge (the handoff's "climb" localization was WRONG, as proved via
    `ITestOutputHelper` capture).
  - `0935a31` **slide_sphere fix:** head near-miss (`neg_step_up==0`) now calls the
    faithful `CSphere::slide_sphere` (existing `SlideSphereInternal`) + continues the
    insert loop, replacing the A6.P4 `Collided` shortcut (`transitional_insert`
    pc:273350-273351).
  - `f984e92` **docs:** corrected the prior P2 handoff.
- **Visual-verified 2026-06-04:** generic step-up climbs; **closed cottage door still
BLOCKS** (slides tangentially, no walkthrough — regression check passed); **cellar
ascent went from ALWAYS-stuck → WORKS-MOSTLY.**
- **Remaining:** an **intermittent corner-wedge** at the cellar-top lip. Retail is
always smooth there (user-confirmed). So it's a real bug.
## The cdb-pinned finding (retail ground truth)
`tools/cdb/cellar-corner-escape.cdb` traced live retail at the cellar-top corner
(decode: `parse_corner_log.py`; raw: `cellar-corner-retail.log`). Retail escapes the
corner by **STEP-UP, not slide:**
- `step_sphere_up`→`step_up` fired **196×** vs only **38 near-misses**. `step_up`
normals: +X wall ×78, **ceiling `(0,0,-1)` ×36**, +Y wall ×32, -X wall ×18, ramp
slope `(0,0.62,0.78)` ×11, -Y wall ×10, floor `(0,0,1)` ×10. So retail step-ups
against EVERY grounded full-hit at the corner.
- **Contact plane transitions ramp `N.z=0.78` (×63) → flat cottage floor `N.z=1.0`
(×76).** That's the escape: retail **climbs the lip off the ramp ONTO the cottage
floor.**
- The user's "run in place against the ceiling (not stuck)" = `step_up` failing on the
ceiling normal `(0,0,-1)`→`step_up_slide` (transient; steer out).

**Divergence pinned:** retail escapes by **stepping up onto the cottage floor**;
acdream **slides at the lip and never makes the ramp→floor transition**. The slide
itself (the `0935a31` fix) is correct + working; the gap is the **final lip-climb**.
This is the **original #98 core** (`DoStepDown`/`step_sphere_down` finding + landing
on the cottage floor), which B1+slide got close to but didn't finish.
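
A hedged Python sketch of how the ramp→floor contact-plane transition could be tallied from a decoded trace. It assumes the contact-plane `N.z` values are already available as a list of floats in trace order; the function names and the `tol` tolerance are hypothetical, and this is not how `parse_corner_log.py` is known to work.

```python
import math
from collections import Counter

RAMP_NZ = 0.78   # ramp contact-plane N.z reported in the retail trace
FLOOR_NZ = 1.0   # flat cottage floor N.z


def classify_nz(nz, tol=0.01):
    """Label a contact-plane N.z as 'floor', 'ramp', or 'other'."""
    if math.isclose(nz, FLOOR_NZ, abs_tol=tol):
        return "floor"
    if math.isclose(nz, RAMP_NZ, abs_tol=tol):
        return "ramp"
    return "other"


def contact_plane_summary(nz_values, tol=0.01):
    """Return (label_histogram, ramp_to_floor_transitions)."""
    labels = [classify_nz(nz, tol) for nz in nz_values]
    histogram = Counter(labels)
    transitions = sum(
        1 for a, b in zip(labels, labels[1:]) if a == "ramp" and b == "floor"
    )
    return histogram, transitions


# Illustrative values only, not taken from the capture.
hist, ramp_to_floor = contact_plane_summary([0.78, 0.78, 1.0, 1.0, 0.78, 1.0, -1.0])
```

Run against both a retail and an acdream sequence, a nonzero transition count on one side and zero on the other would localize the divergence to the lip-climb.
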
## Next step (evidence-first — #98 saga rule: do NOT guess)
1. **Instrument acdream's OWN corner path.** The captures so far
(`cellar-up-capture*.jsonl`, `door-recheck-capture.jsonl`) have positions/normals but
NOT the path. Need to answer: at the cellar-top lip, does acdream's `step_sphere_up`→
`DoStepUp` FIRE and FAIL to land on the cottage floor (DoStepDown can't find
`N.z=1.0` within `StepUpHeight=0.6`), or does it not fire (the hit goes to the slide
path instead)? Relaunch acdream with `ProbeBuildingEnabled` (→ `[neg-poly-dispatch]`/
`[bsp-test]`) + `ACDREAM_DUMP_STEPUP=1` + `ProbeStepWalkEnabled` (→ `[step-walk]`),
reproduce the wedge, read the path. (xunit-swallow doesn't apply to the live app —
Console probes DO surface in the launch log.)
2. **Compare to retail's 196 step_up / ramp→floor transition** and port the missing
lip-climb verbatim. Likely in `DoStepDown` (`TransitionTypes.cs:3074`) /
`BSPQuery.step_sphere_down` (:1206) / `find_walkable` (:693) — the cottage-floor
find+land. Retail anchors: `CTransition::step_up` pc:273099, `step_down` pc:272946,
`BSPTREE::step_sphere_down` pc:323665, `CObjCell::find_env_collisions` (the
walkable-refresh that overwrites the contact plane ramp→floor).
3. **USER VISUAL GATE:** cellar ascent clean (no intermittent wedge); door still blocks;
generic step-up still climbs.
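
Since the launch logs are UTF-16, a plain-text grep can miss the probe lines. A minimal Python sketch for pulling them out, assuming each probe line contains its tag as a literal substring (the tag list is taken from the probes named above; the function names are hypothetical):

```python
from collections import Counter

PROBE_TAGS = (
    "[cell-transit]",
    "stepup:",
    "[neg-poly-dispatch]",
    "[bsp-test]",
    "[step-walk]",
)


def probe_lines(log_path, tags=PROBE_TAGS):
    """Yield (tag, line) for each log line containing one of the probe tags."""
    # The "utf-16" codec uses the byte-order mark, if present, to pick endianness.
    with open(log_path, encoding="utf-16") as fh:
        for line in fh:
            for tag in tags:
                if tag in line:
                    yield tag, line.rstrip("\r\n")
                    break  # count each line once, under its first matching tag


def tag_counts(log_path, tags=PROBE_TAGS):
    """Return a Counter of how many lines matched each probe tag."""
    return Counter(tag for tag, _ in probe_lines(log_path, tags))
```

Usage would be along the lines of `tag_counts("launch-acdream-corner.log")`; a zero count for a tag whose probe was enabled suggests that path never fired.
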
## Apparatus (committed / available)
- `tools/cdb/cellar-corner-escape.cdb` — retail corner trace (step_up/step_sphere_up/
neg_poly_hit/contact_plane counts + args; 30K threshold — TOO HIGH for these
lower-frequency BPs, lower to ~3000 next time so it auto-detaches in one wedge).
- `parse_corner_log.py` — decodes the cdb log (hex→float, histograms).
- Captures (UNCOMMITTED, in worktree root, ~32 MB each — do NOT commit):
`cellar-up-capture.jsonl` (v1, pre-slide-fix wedge), `cellar-up-capture-v2.jsonl`
(post-slide-fix: 96 hit-and-advanced slide frames), `door-recheck-capture.jsonl`,
`cellar-corner-retail.log` (the retail cdb trace).
- `analyze_cellar.py` / `analyze_v2.py` — ad-hoc capture analyzers (capture-specific).
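
For reference, the hex→float decode and histogram step can be sketched in a few lines of Python. This assumes each logged value is a raw 32-bit word printed as hex and that normals arrive as `(x, y, z)` triples of such words; both are assumptions about the log layout, and the function names are hypothetical rather than taken from `parse_corner_log.py`.

```python
import struct
from collections import Counter


def hex_to_float(word):
    """Reinterpret a 32-bit hex word (e.g. '3f800000') as an IEEE-754 float."""
    bits = int(word, 16) & 0xFFFFFFFF  # int() accepts an optional '0x' prefix
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def normal_histogram(triples, ndigits=2):
    """Histogram of decoded normals, rounded so near-identical values merge."""
    return Counter(
        tuple(round(hex_to_float(w), ndigits) for w in triple)
        for triple in triples
    )


# Illustrative words only: two floor normals (0,0,1) and one ceiling (0,0,-1).
demo = normal_histogram(
    [
        ("00000000", "00000000", "3f800000"),
        ("00000000", "00000000", "3f800000"),
        ("00000000", "00000000", "bf800000"),
    ]
)
```

Rounding before counting is a design choice: it keeps float noise from splitting one surface into many histogram buckets, at the cost of merging normals that differ by less than the rounding step.
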
## Test baseline
Core 1310 pass / 4 fail / 1 skip. The 4 fails are pre-existing documents-the-bug /
separate-issue: `DoorCollisionApparatusTests.Apparatus_Grounded_50cmOffCenter`
(synthetic-test artifact — terrain=-1000, no queryable floor; NOT a real door-block
failure — see `memory/project_p2_door_stepup_findings.md`), 2× `DoorBugTrajectoryReplay
LiveCompare_*` (compare against captured-BUGGY-live positions; need re-baseline), and
`BSPStepUpTests.D4` (airborne Path 6 sliding-normal persistence — separate). App 177 green.
## Do NOT
- Guess (the #98 saga burned 10+ speculative fixes) — pin the mechanism with the apparatus first.
- Add a `ResolveCellId` stickiness clamp / suppression flag — the user chose the **principled**
P1 demotion, not a band-aid (no-workarounds rule).
- Flip `Apparatus_Grounded_50cmOffCenter` to `Assert.True(blocked)` — it blocks via a
synthetic-floor artifact, not a faithful door block.
- Re-investigate B1 (`abbd761`) or slide_sphere (`0935a31`) — both shipped + verified + correct.
## FRESH-SESSION KICKOFF PROMPT (copy-paste) — user-approved 2026-06-04: principled P1 membership fix
```
Continue the VERBATIM retail spatial-pipeline port for acdream. Branch claude/thirsty-goldberg-51bb9b
(do NOT branch/worktree; do NOT push without asking; NEVER git stash/gc). PowerShell on Windows;
launch logs are UTF-16.
STATE: M1.5 (Indoor world feels right). P2 COLLISION = DONE + shipped: B1 near-miss gate (abbd761) +
slide_sphere head-near-miss (0935a31). Generic step-up climbs; the closed cottage door BLOCKS (no
walkthrough); step-up AT THE CELLAR LIP works (220 successful candidate-landings on the cottage floor).
The remaining intermittent CELLAR-ASCENT WEDGE is RE-DIAGNOSED (live acdream + retail cdb traces) to a
MEMBERSHIP cell-resolver ping-pong — NOT collision. The user APPROVED the PRINCIPLED P1 fix (demote
ResolveCellId / swept curr_cell as per-frame authority), NOT a stickiness band-aid.
READ FIRST (in order):
1. docs/research/2026-06-04-p2-cellar-corner-stepup-handoff.md — RE-DIAGNOSIS banner + full evidence.
2. memory/project_p2_door_stepup_findings.md — RE-DIAGNOSIS 2026-06-04 entry + shipped fixes + do-not.
3. memory/project_retail_membership_criterion.md — P1 membership context (swept curr_cell pick).
4. docs/superpowers/specs/2026-06-03-verbatim-spatial-pipeline-port-master-plan.md — §A membership
A1-A9, §1 KEEP/REPLACE/DELETE (ResolveCellId -> spawn/teleport seed; per-frame from swept curr_cell),
parked (a)-(d).
THE FINDING (evidence): at the Holtburg cottage cellar-top lip (3-cell junction), acdream step-up
SUCCEEDS — lands CheckPos on the cottage floor (Z=94.0, normal (0,0,1)) 220/518 times, matching retail.
But committed CurPos never advances (stays on the ramp ~(…,9.70,93.41)); every success is REVERTED
because the cell PING-PONGS every tick (0xA9B40175<->0174<->0171, [cell-transit] reason=resolver) -> the
floor-landing is validated against the wrong cell + rejected. Retail (cdb) is smooth: step_up + contact
plane transitions ramp N.z=0.78 -> flat floor N.z=1.0 (76 landings), no cell ping-pong. This CONTRADICTS
P1's claim that ResolveCellId was demoted out of the per-frame path.
THE JOB (evidence-first; do NOT guess):
1. PIN the exact code path producing the per-frame [cell-transit] reason=resolver ping-pong at the lip
(is it PhysicsEngine.ResolveCellId despite P1's demotion claim, the swept advance, or
PlayerMovementController.UpdateCellId/UpdatePlayerCurrCell?), and CONFIRM the resolver flip CAUSES the
step-up commit-rejection (re-validation against the flipped cell) vs being a symptom.
2. PORT THE PRINCIPLED P1 FIX: make the swept curr_cell (find_cell_list pick over the uniform candidate
set) the per-frame membership authority at this junction; demote ResolveCellId to spawn/teleport seed.
Retail anchors: A1 CObjCell::find_cell_list 0x52b4e0 pc:308742; A8 change_cell/SetPositionInternal
0x513390/0x515330; A7 transitional_insert/validate_transition/check_other_cells. The cell must NOT
flip out from under a committed step-up. NO stickiness band-aid.
3. RED->GREEN: deterministic test for the lip junction (cell stable after step-up) + keep B1/B2/B3/door
tests green. USER VISUAL GATE: cellar ascent clean (no wedge); door still blocks; generic step-up climbs.
APPARATUS (in the worktree):
- acdream captures: acdream-corner-capture.jsonl (lip wedge: step-up-works + cell ping-pong),
cellar-up-capture-v2.jsonl, cellar-up-capture.jsonl (JSON Lines, ACDREAM_CAPTURE_RESOLVE, IsPlayer).
- Retail cdb: cellar-corner-retail.log + tools/cdb/cellar-corner-escape.cdb. Decode: parse_corner_log.py
/ tools/cdb/decode_retail_hex.py.
- Probes: ACDREAM_PROBE_CELL=1 ([cell-transit]), ACDREAM_DUMP_STEPUP=1 (stepup:), ACDREAM_PROBE_RESOLVE=1
([resolve]), ACDREAM_CAPTURE_RESOLVE=<path>. Live launch per CLAUDE.md "Running the client".
- cdb on retail at the lip (break CObjCell::find_cell_list / change_cell / SetPositionInternal) if the
decomp is ambiguous. PDB matches; tools/cdb/. Lower the trace threshold (~3000) so it auto-detaches in
one wedge.
DO NOT: re-investigate B1/slide_sphere (shipped, correct); add a ResolveCellId stickiness/suppression
band-aid (user chose principled); flip Apparatus_Grounded_50cmOffCenter to Assert.True(blocked)
(synthetic-floor artifact); guess.
TEST BASELINE: Core 1310 pass / 4 fail / 1 skip (the 4: Apparatus_Grounded_50cmOffCenter [synthetic-floor
artifact], 2x DoorBugTrajectoryReplay LiveCompare_* [captured-buggy-live, re-baseline], BSPStepUpTests.D4
[airborne Path 6, separate]); App 177 green. Branch HEAD: 664101f (+ this commit).
```