docs(handoff): A6.P3 #98 Step 6 — issue + claude.md handoff

Final step of the apparatus plan. Updates ISSUES.md issue #98 and
CLAUDE.md's M1.5 status to reflect:

- The apparatus completed (Steps 1-5 land in commits 35b37df28c282a).
- The real divergence: retail's sphere is at world Z ≈ 94.48 (resting
  on cottage floor) when find_walkable accepts; acdream's failing-
  frame sphere is 2.47m lower at world Z ≈ 92.01.
- The four fix targets, in priority order. Fix plan is the NEXT plan,
  scoped to Target 1 (step-up + ramp climb Z gain) or Target 2
  (cottage-cell sphere reference).
- The replay harness (Issue98CellarUpReplayTests) is the test loop —
  any fix that doesn't change the failing assertions is not the fix.

Today's commit graph on top of slice 5 (cf3deff):
  35b37df  triage — revert neg-poly + bldg-check experiments
  f62a873  Step 2 — cell-dump probe + roundtrip test
  3f56915  Step 2 capture — 3 real-geometry cell fixtures
  856aa78  Step 3 — deterministic replay harness (7 tests)
  6f666c1  Step 4 — retail cdb find_walkable capture script
  28c282a  Step 5 — replay vs retail divergence comparison
  (this)   Step 6 — ISSUES.md + CLAUDE.md handoff

Test baseline: 1167 + 8 (8 pre-existing failures, +19 new passing
tests across the apparatus). Build green throughout.

A6.P3 #98 is now in evidence-driven mode. Fix plan starts from the
divergence doc at
docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md.

Pickup prompt for the fix-plan session is in §"Pickup prompt for the
fix plan" of that doc.
This commit is contained in:
Erik 2026-05-23 15:58:52 +02:00
parent 28c282a563
commit 67005e21f1
2 changed files with 47 additions and 15 deletions

View file

@ -744,16 +744,31 @@ indoor branch (point-in check against `fallbackCellId`'s CellBSP
before falling through to FindCellList). Data confirms ping-pong is
FULLY CLOSED — scen4 cellar capture shows 1 cell-transit (login
teleport) vs 20+ pre-fix. **#90 workaround now redundant — deferred
to A6.P4 removal. #98 SHARP DIAGNOSIS LANDED 2026-05-22** (commits
`134c9b8` retail capture + `efb5f2c` issue update): paired retail+acdream
cdb captures show our BSPQuery.FindCollisions picks Path 5 (Contact
step_up push-back) for the cellar ramp polygon when retail picks
Path 6 (find_walkable land). Cellar-up bug is NOT cell-resolver, NOT
walk_interp, NOT dat-fidelity — it's BSP path-selection. Handoff doc
at [`docs/research/2026-05-22-a6-p3-handoff.md`](docs/research/2026-05-22-a6-p3-handoff.md)
with concrete next-session steps + pickup prompt. Current A6 phase:
**A6.P3 slice 4 — fix issue #98 BSP path-selection in
BSPQuery.FindCollisions dispatcher.**
to A6.P4 removal. #98 APPARATUS COMPLETE 2026-05-23 evening**
(commits `35b37df` triage → `f62a873` cell-dump probe → `3f56915`
fixtures → `856aa78` replay harness → `6f666c1` cdb script →
`28c282a` divergence comparison doc). Four sessions of speculative
fixes (10+ variants) shipped the wrong diagnosis each time; this
session shipped the APPARATUS that turns evidence-driven analysis
into a 200ms test loop. Real divergence: retail's sphere is at
world Z ≈ 94.48 (resting on cottage floor) when find_walkable
accepts; acdream's failing-frame sphere is at world Z ≈ 92.01
(2.47m lower). Retail's ContactPlane writes during cellar-up are
ONLY flat floors (cellar floor or cottage floor), never the ramp.
Retail's find_crossed_edge fires once in 35K BPs; ours uses it
heavily. **Fix targets (priority): (1) Transition.AdjustOffset
slope projection / DoStepUp WalkInterp handling — ramp climb
doesn't gain Z; (2) cottage-cell candidacy using wrong sphere
reference; (3) find_crossed_edge over-use; (4) ramp polygon normal
divergence (low confidence).** Full divergence reading +
fix-plan pickup prompt at
[`docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md`](docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md).
Current A6 phase:
**A6.P3 — write evidence-driven fix plan against Target 1 or 2;
NO speculative fixes (4 prior sessions guessed and were wrong).
The replay harness ([`tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs`](tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs))
is the test loop; a fix that doesn't change the failing assertions
is not the fix.**
Findings doc:
[`docs/research/2026-05-21-a6-cdb-capture-findings.md`](docs/research/2026-05-21-a6-cdb-capture-findings.md).
Original demo scenario (Holtburg Sewer end-to-end) is unreachable: sewer

View file

@ -698,12 +698,29 @@ BUT the cellar-up symptom PERSISTS even with the cell-resolver fix. The remainin
Full slice 5 evidence + sharpened next-step pickup at [`docs/research/2026-05-22-a6-p3-slice5-handoff.md`](docs/research/2026-05-22-a6-p3-slice5-handoff.md). Capture data at `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`.
**The fix target is cell-resolver behavior at the cellar/cottage boundary**, NOT `BSPQuery.FindCollisions` path-selection. Likely changes in `PhysicsEngine.ResolveCellId` (tiebreaker for sphere spanning two cells) or `Transition.TransitionalInsert` (re-resolve cell between iterations when CheckPos has moved significantly).
**Diagnosis FINALIZED 2026-05-23 evening** (commit `28c282a`, divergence doc at [`docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md`](docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md)). After 4 sessions of speculative fixes (10+ variants, none worked), apparatus shipped to turn evidence-driven analysis into a 200ms test loop:
**Failed fix attempts during 2026-05-22 (informational):**
- WalkInterp reset before placement_insert (commit `bbd1df4`) — logical retail-faithful improvement but doesn't fix the cellar-up symptom. Keep in tree as small quality fix.
- Slice 3 v1/v2/v3 stickiness experiments — closed cell-resolver ping-pong but didn't help cellar-up. v3 reverted (commit `8bd3117`).
- Slice 5 (this session): no fix attempted — only diagnostic probe + sharpened diagnosis shipped. The "Path 5 vs Path 6" target was investigated and ruled out via cdb data.
- Deterministic replay harness: [`tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs`](tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs) loads the three cottage/cellar cell fixtures (captured live via the new `ACDREAM_DUMP_CELLS` probe) and drives the failing-frame sphere through our walkable predicates. 7 tests, all pass, all reproduce the live failure without a client launch.
- Retail comparison: [`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log`](docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log) — 35K cdb BP hits during the equivalent retail cellar-up.
**REAL divergence**: NOT cell-resolver. NOT path-selection. NOT polygon 0x0020 the cellar ceiling.
- Retail's sphere is at world Z ≈ **94.48** (resting on cottage floor) when `find_walkable` accepts the cottage main floor plane.
- Our failing-frame sphere is at world Z ≈ **92.01** (2.47m lower) when our walkable query rejects the cottage main floor.
- Retail's `ContactPlane` writes during cellar-up are ONLY flat horizontal planes (cellar floor Z=90.95 OR cottage floor Z=94.00). Never the ramp.
- Retail's `find_crossed_edge` fires ONCE in 35K BPs. Acdream uses it heavily.
**Fix targets** (priority order, from the comparison doc):
1. (HIGHEST) Step-up + ramp climb doesn't gain enough Z per tick. Retail climbs gradually across thousands of ticks; ours oscillates at Z≈92. Look at `Transition.AdjustOffset` slope projection + `Transition.DoStepUp` WalkInterp handling.
2. Cottage-cell candidacy uses wrong sphere reference (pre-step-up vs step-lifted center).
3. `find_crossed_edge` over-use in our walkable acceptance path.
4. (LOW) Ramp polygon normal divergence.
**Failed fix attempts (informational):**
- WalkInterp reset before placement_insert (commit `bbd1df4`) — logical retail-faithful improvement but doesn't fix the cellar-up symptom. Keep.
- Slice 3 v1/v2/v3 cell-resolver stickiness — closed ping-pong but didn't help cellar-up. v3 reverted (`8bd3117`).
- Slice 5: `[place-fail]` probe + diagnosis correction. Useful infrastructure; not a fix.
- Slice 6 (2026-05-22 PM): 6 placement-insert bypass variants. None unstuck the player.
- Slice 7 (2026-05-23 AM): terrain hole cutout, multi-sphere CellTransit, building bldg-check, negative-side polygon support, render-vs-physics origin split. Triaged in commit `35b37df`: kept render-physics split + multi-sphere CellTransit + diagnostic probes; reverted neg-poly + bldg-check (didn't fix #98).
**Related:**
- Inn stairs UP works (different geometry, doesn't trigger this specific failure mode)