acdream/docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md
Erik ec47159a2e docs(handoff): A6.P3 #98 — full-session handoff doc + CLAUDE.md/ISSUES.md updates
Adds the canonical pickup document
docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md with:
- TL;DR + session arc (10 commits chronological)
- What the trajectory replay harness IS (committed apparatus)
- Bug 1 status: #98 cellar-up freeze (unfixed, 6 fix shapes failed)
- Bug 2 status: airborne-at-tick-1 (new, 6 hypotheses tested, root
  cause not isolated)
- Exclusion list: DO NOT retry any of the 6+6 dead ends
- Apparatus inventory: probes, tests, fixtures, cdb captures
- Recommended next move: side-by-side comparison harness against
  live PlayerMovementController state (evidence-first instead of
  speculation-first)
- Alternative moves: pivot to other M1.5 issues or M2 prep
- Self-contained pickup prompt at the bottom of the handoff doc

Updates CLAUDE.md's "Current A6 phase" block to point at the new
handoff doc as the canonical resume artifact.

Updates ISSUES.md's #98 entry with the late-day extension findings,
the 6-hypothesis exclusion list, and a pointer to the handoff doc.

Test baseline maintained at 1172 + 8 pre-existing failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 19:09:00 +02:00


A6.P3 #98 — Trajectory Replay Harness handoff

Session: 2026-05-23 (full day, 10+ commits)
Worktree: C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c
Branch: claude/strange-albattani-3fc83c

This handoff documents the apparatus committed this session, the things we learned, the things we ruled out, and the concrete next-session pickup move. Read this first when you resume.


TL;DR

  • #98 is NOT fixed. Six fix-shape attempts across this saga (4 prior sessions + 1 this session's Shape 1) all failed or got reverted.
  • The trajectory replay harness is REAL but blocked. Mechanically works — runs 200 physics ticks in <100 ms against pre-loaded cell fixtures. Blocked on a NEW second bug we surfaced during harness commissioning (airborne-at-tick-1).
  • The cellar ramp polygon is NOT in the cell — it's in a separate GfxObj (a static building piece) registered as a ShadowEntry. The harness reconstructs the ramp polygon programmatically from the live capture's polydump data.
  • Per the systematic-debugging skill: 6 hypotheses tested without convergence = stop and reflect. The next-session move is NOT another speculative fix attempt — it's a side-by-side comparison harness against live PlayerMovementController state.

What ran this session (chronological, 10 commits)

| Commit | What |
| --- | --- |
| 8a232a3 | [step-walk-adjust] probe inside Transition.AdjustOffset: names which projection branch fires per call + Z gain |
| 8daf7e7 | Findings note + capture snapshot. AdjustOffset projection is CORRECT: sphere climbs 90.95 → 92.80 monotonically. Caps at top of ramp because step-up rejects (cottage floor is ABOVE, not below). |
| 0cb4c59 | Shape 1 fix attempt: gate BSPQuery.AdjustSphereToPlane's two SetContactPlane call sites by worldNormal.Z >= 0.99. |
| 402ec10 | Revert Shape 1: broke OnWalkable for all sloped walkable surfaces (74% of live capture lines in falling state). |
| 5f3b64c | Session-pause handoff in ISSUES.md + CLAUDE.md. |
| 4c9290c | Trajectory replay harness (PhysicsEngine + PhysicsDataCache + PhysicsBody + cell fixtures). Mechanics validated. |
| 3d2d10b | Harness extension: programmatic synthetic stair GfxObj + ShadowEntry. Discovery: ramp polygon lives in GfxObj, not cell. |
| 227a775 | Diagnostic dump + 0.05m initial Z lift experiment. Same airborne behavior. |
| 5c6bdbe | Deep investigation: 6 hypotheses tested via the harness, none isolated root cause of (0,1,0) hit at tick 1. |

What the harness IS (committed apparatus)

tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs

A deterministic trajectory replay that:

  1. Loads three issue-#98 cell fixtures (cellar + 2 cottage neighbors) via CellDumpSerializer.Hydrate.
  2. Wraps each cell with a synthetic single-leaf PhysicsBSPTree (AttachSyntheticBsp) — needed because Hydrate sets BSP=null and without BSP the indoor branch is skipped.
  3. Registers the cellar's stair-ramp polygon as a synthetic GfxObjPhysics (RegisterStairRampGfxObj) — polygon vertices in WORLD coordinates so the ShadowEntry registers at origin with identity rotation/scale.
  4. Constructs a PhysicsBody seeded with:
    • ContactPlaneValid=true, ContactPlane=(0,0,1,-90.95) (cellar floor plane)
    • WalkablePolygonValid=true, WalkableVertices = cellar floor poly under sphere XY
    • TransientState = Contact | OnWalkable
  5. Drives N ticks of PhysicsEngine.ResolveWithTransition with a constant -Y forward offset (PerTickOffset = (0, -0.1, 0)).
  6. Returns a per-tick TrajectoryPoint list (Tick, Position, CellId, IsOnGround, CpValid).
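The contact-plane seed in step 4 can be sanity-checked with plain vector math. The sketch below is illustrative only: it assumes the four components are stored as (normal, D) with the plane equation n·p + D = 0, which is what `System.Numerics.Plane` uses and which matches a floor at world Z 90.95. `ContactPlaneSeedSketch` and `SignedDistance` are hypothetical names, not part of the harness.

```csharp
using System;
using System.Numerics;

public static class ContactPlaneSeedSketch
{
    // Seed values quoted in step 4: normal (0,0,1), D = -90.95.
    // Assumption: the engine uses the same n·p + D = 0 convention as System.Numerics.Plane.
    public static readonly Plane CellarFloor = new Plane(new Vector3(0f, 0f, 1f), -90.95f);

    // Signed distance of a point from the plane (n·p + D).
    // Zero means the point lies on the plane; positive means it is on the normal's side.
    public static float SignedDistance(Plane plane, Vector3 point)
        => Plane.DotCoordinate(plane, point);
}
```

Under that convention any point with Z = 90.95 sits on the seeded plane regardless of X and Y, and a point one metre higher reports a distance of about 1.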

5 tests, all passing in ~75 ms total. Baseline maintained at 1167 + 5 (harness) = 1172 + 8 pre-existing failures.

Reusable helpers in the harness

| Helper | Purpose |
| --- | --- |
| BuildEngineWithCellarFixtures() | Full engine setup: cells + synthetic BSPs + (optional) stair GfxObj |
| AttachSyntheticBsp(CellPhysics) | Wraps a hydrated cell with a one-leaf BSP referencing every Resolved polygon. Reusable for any indoor-cell test that needs the indoor BSP path to fire. |
| RegisterStairRampGfxObj(engine, cache) | Constructs a programmatic GfxObj + ShadowEntry for the cellar ramp polygon. Reusable for any indoor-static-collision test. |
| BuildInitialBody() | PhysicsBody with both ContactPlane AND WalkablePolygon seeded. The seeding pattern is the discovery: both must be set or the engine treats the sphere as "grounded but anchorless." |
| SimulateTicks(engine, body, cellId, N) | Per-tick driver with proper cross-tick PhysicsBody state. |
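The shape of the per-tick driver can be sketched without the engine. This is a minimal stand-in, not the committed code: the `ResolveStep` delegate is a hypothetical placeholder for one `PhysicsEngine.ResolveWithTransition` call, the `uint` type for `CellId` is an assumption, and `FirstAirborneTick` is an illustrative helper. Only the `TrajectoryPoint` field names and the `(0, -0.1, 0)` offset come from the harness description.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Field names follow the harness description; the types are assumptions.
public readonly record struct TrajectoryPoint(
    int Tick, Vector3 Position, uint CellId, bool IsOnGround, bool CpValid);

// Hypothetical stand-in for one resolve call: previous state + offset -> next state.
public delegate TrajectoryPoint ResolveStep(int tick, TrajectoryPoint previous, Vector3 offset);

public static class TrajectoryDriverSketch
{
    // Constant -Y forward offset applied on every tick.
    public static readonly Vector3 PerTickOffset = new Vector3(0f, -0.1f, 0f);

    // Carries state from tick to tick and records one point per tick.
    public static List<TrajectoryPoint> SimulateTicks(TrajectoryPoint start, int ticks, ResolveStep resolve)
    {
        var points = new List<TrajectoryPoint>(ticks);
        var current = start;
        for (int tick = 1; tick <= ticks; tick++)
        {
            current = resolve(tick, current, PerTickOffset);
            points.Add(current);
        }
        return points;
    }

    // Returns the first tick at which the body is no longer grounded, or -1 if it never leaves the ground.
    public static int FirstAirborneTick(IReadOnlyList<TrajectoryPoint> points)
    {
        foreach (var point in points)
        {
            if (!point.IsOnGround) return point.Tick;
        }
        return -1;
    }
}
```

The point of threading `current` through the loop is that each tick sees the previous tick's resolved state rather than a freshly seeded one, so a regression such as going airborne shows up as a specific tick number.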

Bug 1: #98 — cellar-up freeze (UNFIXED)

The original bug. The sphere climbs the cellar ramp partway (world Z 90.95 → 92.80), then caps. The cottage floor at world Z=94 is still 1.2 m above.

Refined diagnosis from this session's [step-walk-adjust] probe: AdjustOffset's slope projection is CORRECT. 145 of 146 calls take the into-plane branch, with a mean zGain of +0.045 m per call. The cap happens because the step-down probe that step-up runs at the ramp top finds no walkable surface below (the cottage floor is ABOVE). The probe logged 101 step-down rejections vs 1 acceptance.

Six fix shapes attempted across the saga, all failed:

  1. Placement-insert bypasses (slice 6, 6 variants)
  2. Cell-resolver tiebreaker changes (slice 3)
  3. Negative-side polygon handling (slice 7, reverted)
  4. Building-check / IsLandblockBuilding flag (slice 7, reverted)
  5. Multi-cell BSP iteration (A4, shipped but doesn't address top-of-ramp)
  6. Shape 1: gate ContactPlane assignment by Normal.Z ≥ 0.99 (this session — broke OnWalkable, reverted)
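Why Shape 1 broke sloped walkables falls out of the numbers already in this doc. The sketch below applies the 0.99 threshold to the cellar floor normal and to the stair normal (0, 0.719, 0.695) quoted in the airborne-bug notes. `Shape1GateSketch` and its members are hypothetical names; the real gate lived in `BSPQuery.AdjustSphereToPlane` and is not reproduced here.

```csharp
using System;
using System.Numerics;

public static class Shape1GateSketch
{
    // Threshold used by the reverted Shape 1 attempt.
    public const float Shape1MinNormalZ = 0.99f;

    // True when the gate would still allow a ContactPlane assignment for this surface.
    public static bool PassesShape1Gate(Vector3 worldNormal)
        => worldNormal.Z >= Shape1MinNormalZ;

    // Angle between the surface normal and world up, in degrees (0 = flat floor).
    public static double SlopeDegrees(Vector3 worldNormal)
        => Math.Acos(Vector3.Normalize(worldNormal).Z) * 180.0 / Math.PI;
}
```

The flat floor (0,0,1) passes the gate. The ramp normal has Z = 0.695, a slope of roughly 46 degrees, so it fails, and a threshold of 0.99 only admits surfaces within about 8 degrees of flat. That is consistent with the revert note that every sloped walkable lost OnWalkable.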

Bug 2: Airborne-at-tick-1 (NEW, surfaced this session)

When the trajectory replay harness drives ResolveWithTransition with a sphere seeded grounded on the cellar floor, tick 1 reports hit=yes n=(0,1,0) walkable=False/True and the body goes airborne. The sphere then floats horizontally over the cellar floor for the rest of the simulation, never touching the ramp.

This is structurally different from #98:

  • #98 fails MID-CLIMB at the top of the ramp
  • This bug fails AT START — sphere can't even walk a flat floor

This bug blocks the harness from reproducing #98 in test isolation. It must be solved before the harness can drive #98 fix attempts.

Confirmed via investigation (committed in 5c6bdbe)

| Hypothesis | Outcome |
| --- | --- |
| WalkablePolygon NOT seeded in body | PARTIAL FIX: walkable=True survives but (0,1,0) hit still appears |
| Initial sphere Z lift 0.0 vs 0.05m | NO: same hit either way |
| Synthetic stair GfxObj triggering wall hit | NO: same hit without stair |
| Stub landblock terrain at Z=0 triggering hit | NO: same hit without landblock |
| Cell BSP=null falling through to terrain | NO: same hit with synthetic BSP attached |
| body=null vs body-with-CP-seed | NO: same hit either way |

What we know about the (0,1,0) hit

  • It's a +Y world normal that doesn't match any registered geometry: the stair has normal (0, 0.719, 0.695), the cellar floor has normal (0,0,1), and the cellar walls have axis-aligned normals but sit at known positions far from the sphere.
  • It appears at the after-validate step-walk probe site — set BY ValidateTransition between after-insert and after-validate.
  • ValidateTransition's default-fallback line sets UnitZ=(0,0,1), not UnitY=(0,1,0). So something INSIDE TransitionalInsert set ci.CollisionNormal=(0,1,0) before ValidateTransition ran.
  • 12 different SetCollisionNormal call sites in TransitionTypes.cs — root cause not isolated to one.
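The mismatch in the first bullet can be checked numerically. The sketch below compares the reported hit normal against the floor and stair normals quoted above, and against the harness's per-tick offset. It is a hedged illustration: `HitNormalCheckSketch`, `Cosine`, and `SameDirection` are hypothetical names, the 0.999 tolerance is an assumption, and wall normals are left out because the doc rules the walls out by position rather than by direction.

```csharp
using System;
using System.Numerics;

public static class HitNormalCheckSketch
{
    // Values quoted in this doc.
    public static readonly Vector3 HitNormal = new Vector3(0f, 1f, 0f);
    public static readonly Vector3 CellarFloorNormal = new Vector3(0f, 0f, 1f);
    public static readonly Vector3 StairRampNormal = new Vector3(0f, 0.719f, 0.695f);
    public static readonly Vector3 PerTickOffset = new Vector3(0f, -0.1f, 0f);

    // Cosine of the angle between two non-zero vectors.
    public static float Cosine(Vector3 a, Vector3 b)
        => Vector3.Dot(Vector3.Normalize(a), Vector3.Normalize(b));

    // Assumption: two directions "match" when their cosine is at least minCosine.
    public static bool SameDirection(Vector3 a, Vector3 b, float minCosine = 0.999f)
        => Cosine(a, b) >= minCosine;
}
```

With these values the hit normal has cosine 0 against the floor and roughly 0.72 against the stair, so it matches neither. Its cosine against the per-tick offset is -1, meaning the reported normal points exactly opposite to the direction of travel.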

DO NOT DO (next session)

The failed fix shapes from the #98 saga plus this session's six failed hypotheses on the airborne bug add up to a long list of dead ends. Don't retry any of these:

For #98 itself:

  • Placement-insert bypasses in BSPQuery.FindCollisions / Transition.FindEnvCollisions / Transition.DoStepDown
  • Cell-resolver tiebreaker changes in PhysicsEngine.ResolveCellId (slice 3 already shipped a fix)
  • Negative-side polygon handling
  • bldg-check / IsLandblockBuilding flag propagation
  • Gating ContactPlane assignment by Normal.Z in BSPQuery.AdjustSphereToPlane (Shape 1 — breaks OnWalkable for sloped walkables)
  • Any suppression flag, grace period, retry loop, or if (problematicState) return early workaround

For the airborne bug:

  • Re-attempting any of the 6 hypotheses listed above
  • Speculation about init fields without comparing to a live capture
  • Adding more probes randomly — we already have 4+ probes wired

What apparatus exists to use

| Tool | Location | Purpose |
| --- | --- | --- |
| [step-walk] probe | TransitionTypes.cs (many call sites) | Per-step-site full state dump |
| [step-walk-adjust] probe | TransitionTypes.cs:AdjustOffset | Per-AdjustOffset call branch + zGain |
| [resolve] probe | PhysicsEngine.cs end of ResolveWithTransition | Per-call input/output/hit/cp summary |
| [indoor-bsp] probe | TransitionTypes.cs:1917-1926 | Per-indoor-BSP-call summary (only when BSP non-null) |
| [poly-dump] probe | BSPQuery.cs:402 | Per-AdjustSphereToPlane polygon hit dump |
| [push-back] probe | BSPQuery.cs:354-394 | Per-push-back motion details |
| [place-fail] probe | TransitionTypes.cs:2908 | Per-DoStepDown placement_insert rejection |
| Issue98CellarUpReplayTests | tests/.../Physics/ | 7 tests, single-frame failing-frame geometry |
| CellarUpTrajectoryReplayTests | tests/.../Physics/ | 5 tests, N-tick trajectory harness |
| Cell fixtures | tests/.../Fixtures/issue98/*.json | 3 hydratable cells (cellar + 2 cottage neighbors) |
| Retail cdb captures | docs/research/2026-05-23-a6-captures/ | Multiple capture sessions, decoded |
| cdb scripts | tools/cdb/.cdb + tools/cdb/.ps1 | Re-runnable retail-side capture infrastructure |

Recommended next-session move

Build a side-by-side comparison harness against live PlayerMovementController state.

Concretely:

  1. In the live client, attach a probe to PlayerMovementController.cs:1105-1129 (the production ResolveWithTransition call site) that captures the FULL state passed in (every PhysicsBody field, sphere radius/height, step heights, mover flags, entity id) and the FULL state returned (ResolveResult fields, body state after the call).
  2. Walk in a Holtburg cottage cellar. Capture 2-3 ticks of full state.
  3. Save the capture as a JSON fixture in docs/research/.
  4. Add a test to CellarUpTrajectoryReplayTests.cs that loads that fixture and feeds the EXACT captured state into ResolveWithTransition. Compare per-field divergence between the captured ResolveResult and the harness's result.
  5. The divergence WILL exist (otherwise we wouldn't have the airborne bug). The first divergence pinpoints the missing state init step.
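The per-field comparison in step 4 could look roughly like the sketch below. It assumes both the live capture and the harness result are serialized to JSON with the same field names; the fixture schema does not exist yet, so every path in the usage note is made up for illustration. `CaptureDiffSketch`, `Flatten`, and `Diverging` are hypothetical names. Only `System.Text.Json` calls from the .NET base library are used.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class CaptureDiffSketch
{
    // Flattens a JSON value into "path -> raw text" pairs, e.g. "body.position[2]" -> "90.95".
    public static void Flatten(JsonElement element, string path, IDictionary<string, string> into)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (var property in element.EnumerateObject())
                {
                    var childPath = path.Length == 0 ? property.Name : path + "." + property.Name;
                    Flatten(property.Value, childPath, into);
                }
                break;
            case JsonValueKind.Array:
                var index = 0;
                foreach (var item in element.EnumerateArray())
                {
                    Flatten(item, path + "[" + index + "]", into);
                    index++;
                }
                break;
            default:
                into[path] = element.GetRawText();
                break;
        }
    }

    // Returns one line per diverging field, ordered by path, so the first entry is a stable starting point.
    public static List<string> Diverging(string capturedJson, string harnessJson)
    {
        var captured = new Dictionary<string, string>(StringComparer.Ordinal);
        var harness = new Dictionary<string, string>(StringComparer.Ordinal);
        using (var doc = JsonDocument.Parse(capturedJson)) Flatten(doc.RootElement, "", captured);
        using (var doc = JsonDocument.Parse(harnessJson)) Flatten(doc.RootElement, "", harness);

        var paths = new SortedSet<string>(captured.Keys, StringComparer.Ordinal);
        paths.UnionWith(harness.Keys);

        var result = new List<string>();
        foreach (var path in paths)
        {
            var a = captured.TryGetValue(path, out var left) ? left : "<missing>";
            var b = harness.TryGetValue(path, out var right) ? right : "<missing>";
            if (!string.Equals(a, b, StringComparison.Ordinal))
                result.Add(path + ": captured=" + a + " harness=" + b);
        }
        return result;
    }
}
```

For example, `Diverging("{\"body\":{\"onGround\":true}}", "{\"body\":{\"onGround\":false}}")` yields a single line starting with `body.onGround`. The comparison is on raw JSON text, so `1.0` and `1` count as different; a real test would probably want a numeric tolerance for float fields, and "first" here means first by path order, not first in execution order.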

This approach is evidence-driven, not speculation-driven. The whole reason the 6-hypothesis investigation failed is we kept guessing what the harness was missing. A live capture tells us directly.

Estimated effort: 1 hour to wire the production-side probe + capture + JSON dump; 30 min to write the comparison test; 30 min to analyze the first divergence. Total ~2 hours, then the airborne bug should be solvable.


Alternative next-session moves

If the comparison harness investment feels too big, here are smaller alternatives:

  1. Pivot to a different M1.5 issue. The cellar-up demo isn't the only M1.5 critical path. Other issues in docs/ISSUES.md that need work: chronic open issues (#2, #4, #28, #29, #37, #41), the #90 workaround removal (now redundant after slice 3), or one of the Phase C visual fidelity items. Less coupling, faster forward progress.

  2. Pivot to M2 prep. M1.5 is blocking M2 by policy ("one active milestone at a time"). But if the user authorizes, M2 has nicer scope — inventory panel (F.2), combat math (F.3), dev panels (F.5a). Visible wins, no physics rabbit holes.

  3. Use the harness elsewhere. The RegisterStairRampGfxObj + AttachSyntheticBsp patterns are reusable for ANY indoor-static-collision test. If there's a different bug (corpse pickup boundary, door swing collision, etc.) that needs deterministic testing, the harness's apparatus is ready.


Pickup prompt for next session

A6.P3 #98 trajectory harness — session paused 2026-05-23.

Read FIRST:
  docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md (this file)
  tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs
  (especially the class-doc comment + the 5 [Fact] tests)

State both altitudes:
  Currently working toward: M1.5 — Indoor world feels right
  Current phase: A6.P3 — trajectory replay harness, blocked on a SECOND
  bug (airborne-at-tick-1) that surfaced during commissioning. The
  original #98 cellar-up freeze remains unfixed; the harness needs
  the airborne bug solved before it can drive #98 fix attempts.

The handoff doc has three options for what to do next:

(A) Build the side-by-side comparison harness — capture live
    PlayerMovementController state, replay in test, diff. ~2 hours.
    Most retail-faithful path. Recommended.

(B) Pivot to a different M1.5 issue (chronic open issues, #90 removal,
    Phase C work). Less coupling, faster wins.

(C) Pivot to M2 prep (requires user authorization — M2 is policy-deferred
    until M1.5 lands).

Pick A, B, or C. If A: there's a step-by-step plan in the handoff
doc's "Recommended next-session move" section.

CLAUDE.md rules apply throughout. NO speculative fixes — the saga has
six failed shapes already. Evidence first.

Test baseline: 1172 + 8 (pre-existing failures). Maintain throughout.