docs(handoff): A6.P3 #98 Step 6 — issue + claude.md handoff

Final step of the apparatus plan. Updates ISSUES.md issue #98 and
CLAUDE.md's M1.5 status to reflect:

- The apparatus is complete (Steps 1-5 landed in commits 35b37df..28c282a).
- The real divergence: retail's sphere is at world Z ≈ 94.48 (resting
  on cottage floor) when find_walkable accepts; acdream's failing-
  frame sphere is 2.47m lower at world Z ≈ 92.01.
- The four fix targets, in priority order. Fix plan is the NEXT plan,
  scoped to Target 1 (step-up + ramp climb Z gain) or Target 2
  (cottage-cell sphere reference).
- The replay harness (Issue98CellarUpReplayTests) is the test loop —
  any fix that doesn't change the failing assertions is not the fix.

Today's commit graph on top of slice 5 (cf3deff):
  35b37df  triage — revert neg-poly + bldg-check experiments
  f62a873  Step 2 — cell-dump probe + roundtrip test
  3f56915  Step 2 capture — 3 real-geometry cell fixtures
  856aa78  Step 3 — deterministic replay harness (7 tests)
  6f666c1  Step 4 — retail cdb find_walkable capture script
  28c282a  Step 5 — replay vs retail divergence comparison
  (this)   Step 6 — ISSUES.md + CLAUDE.md handoff

Test baseline: 1167 passing + 8 failing (all 8 failures are pre-existing;
+19 new passing tests across the apparatus). Build green throughout.

A6.P3 #98 is now in evidence-driven mode. Fix plan starts from the
divergence doc at
docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md.

Pickup prompt for the fix-plan session is in §"Pickup prompt for the
fix plan" of that doc.
Erik 2026-05-23 15:58:52 +02:00
parent 28c282a563
commit 67005e21f1
2 changed files with 47 additions and 15 deletions


@@ -698,12 +698,29 @@ BUT the cellar-up symptom PERSISTS even with the cell-resolver fix. The remainin
Full slice 5 evidence + sharpened next-step pickup at [`docs/research/2026-05-22-a6-p3-slice5-handoff.md`](docs/research/2026-05-22-a6-p3-slice5-handoff.md). Capture data at `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`.
**The fix target is cell-resolver behavior at the cellar/cottage boundary**, NOT `BSPQuery.FindCollisions` path-selection. Likely changes in `PhysicsEngine.ResolveCellId` (tiebreaker for sphere spanning two cells) or `Transition.TransitionalInsert` (re-resolve cell between iterations when CheckPos has moved significantly).
**Diagnosis FINALIZED 2026-05-23 evening** (commit `28c282a`, divergence doc at [`docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md`](docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md)). After 4 sessions of speculative fixes (10+ variants, none of which worked), the apparatus shipped to turn evidence-driven analysis into a 200ms test loop:
**Failed fix attempts during 2026-05-22 (informational):**
- WalkInterp reset before placement_insert (commit `bbd1df4`) — logical retail-faithful improvement but doesn't fix the cellar-up symptom. Keep in tree as small quality fix.
- Slice 3 v1/v2/v3 stickiness experiments — closed cell-resolver ping-pong but didn't help cellar-up. v3 reverted (commit `8bd3117`).
- Slice 5 (this session): no fix attempted — only diagnostic probe + sharpened diagnosis shipped. The "Path 5 vs Path 6" target was investigated and ruled out via cdb data.
- Deterministic replay harness: [`tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs`](tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs) loads the three cottage/cellar cell fixtures (captured live via the new `ACDREAM_DUMP_CELLS` probe) and drives the failing-frame sphere through our walkable predicates. 7 tests, all pass, all reproduce the live failure without a client launch.
- Retail comparison: [`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log`](docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log) — 35K cdb BP hits during the equivalent retail cellar-up.
**REAL divergence**: NOT cell-resolver. NOT path-selection. NOT polygon 0x0020 (the cellar ceiling).
- Retail's sphere is at world Z ≈ **94.48** (resting on cottage floor) when `find_walkable` accepts the cottage main floor plane.
- Our failing-frame sphere is at world Z ≈ **92.01** (2.47m lower) when our walkable query rejects the cottage main floor.
- Retail's `ContactPlane` writes during cellar-up are ONLY flat horizontal planes (cellar floor Z=90.95 OR cottage floor Z=94.00). Never the ramp.
- Retail's `find_crossed_edge` fires ONCE in 35K BPs. Acdream uses it heavily.
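The numbers in the divergence bullets can be sanity-checked with a small sketch. The Z values are the ones quoted in this entry. The helper name `is_flat_horizontal`, its tolerance, and the inferred rest height of the sphere above the cottage floor are illustrative assumptions, not acdream or retail code.

```python
import math

# Z values quoted in this entry (world units, metres).
RETAIL_ACCEPT_Z = 94.48   # retail sphere Z when find_walkable accepts
ACDREAM_FAIL_Z = 92.01    # acdream failing-frame sphere Z
CELLAR_FLOOR_Z = 90.95
COTTAGE_FLOOR_Z = 94.00


def z_gap(retail_z: float, ours_z: float) -> float:
    """Vertical distance between the two sphere positions, rounded to cm."""
    return round(retail_z - ours_z, 2)


def is_flat_horizontal(normal, tol: float = 1e-3) -> bool:
    """Hypothetical helper: True when a plane normal points straight up.

    A ramp normal has a non-zero horizontal component, so it fails this
    check, while a floor normal such as (0, 0, 1) passes.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return length > 0 and abs(nz / length - 1.0) <= tol


gap = z_gap(RETAIL_ACCEPT_Z, ACDREAM_FAIL_Z)
# Assumption: "resting on cottage floor" means this is roughly the
# distance from the floor plane to the sphere centre.
rest_height = round(RETAIL_ACCEPT_Z - COTTAGE_FLOOR_Z, 2)
print(gap, rest_height)
```

Under these assumptions the gap comes out at 2.47 and the rest height at 0.48, so a sphere stuck near Z≈92 sits roughly 2 m below the cottage floor plane rather than on it.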
**Fix targets** (priority order, from the comparison doc):
1. (HIGHEST) Step-up + ramp climb doesn't gain enough Z per tick. Retail climbs gradually across thousands of ticks; ours oscillates at Z≈92. Look at `Transition.AdjustOffset` slope projection + `Transition.DoStepUp` WalkInterp handling.
2. Cottage-cell candidacy uses wrong sphere reference (pre-step-up vs step-lifted center).
3. `find_crossed_edge` over-use in our walkable acceptance path.
4. (LOW) Ramp polygon normal divergence.
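Target 1 is about how much Z a horizontal move gains once it is projected onto a ramp. Below is a minimal sketch of generic plane projection, not the logic of `Transition.AdjustOffset` or of retail. The function names, the 3-4-5 ramp normal, and the 1 cm per-tick step are all hypothetical values chosen for easy arithmetic.

```python
import math


def project_onto_plane(offset, normal):
    """Remove the component of `offset` along the unit vector `normal`."""
    d = sum(o * n for o, n in zip(offset, normal))
    return tuple(o - d * n for o, n in zip(offset, normal))


def ticks_to_climb(z_needed: float, z_gain_per_tick: float):
    """Ticks needed to gain `z_needed`, or None if no Z is ever gained."""
    if z_gain_per_tick <= 0:
        return None
    return math.ceil(z_needed / z_gain_per_tick)


# Hypothetical ramp rising along +Y with slope 0.75 (unit normal).
ramp_normal = (0.0, -0.6, 0.8)
# Hypothetical horizontal move of 1 cm per tick toward the ramp.
step = (0.0, 0.01, 0.0)

slid = project_onto_plane(step, ramp_normal)
needed = 2.47  # Z gap quoted in this entry
print(slid, ticks_to_climb(needed, slid[2]))
```

With these made-up inputs each tick gains about 0.0048 in Z, so closing the 2.47 gap takes a few hundred ticks. The same helper returns `None` when the per-tick gain is zero, which is one way to express the "oscillates at Z≈92" symptom as a failing check in a replay test. Whether the real code rescales the projected offset to preserve its length is not stated here and is left out of the sketch.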
**Failed fix attempts (informational):**
- WalkInterp reset before placement_insert (commit `bbd1df4`) — logical retail-faithful improvement but doesn't fix the cellar-up symptom. Keep.
- Slice 3 v1/v2/v3 cell-resolver stickiness — closed ping-pong but didn't help cellar-up. v3 reverted (`8bd3117`).
- Slice 5: `[place-fail]` probe + diagnosis correction. Useful infrastructure; not a fix.
- Slice 6 (2026-05-22 PM): 6 placement-insert bypass variants. None unstuck the player.
- Slice 7 (2026-05-23 AM): terrain hole cutout, multi-sphere CellTransit, building bldg-check, negative-side polygon support, render-vs-physics origin split. Triaged in commit `35b37df`: kept render-physics split + multi-sphere CellTransit + diagnostic probes; reverted neg-poly + bldg-check (didn't fix #98).
**Related:**
- Inn stairs UP works (different geometry, doesn't trigger this specific failure mode)