diff --git a/CLAUDE.md b/CLAUDE.md
index 5b71754..ed22794 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -872,13 +872,49 @@ Two commits (`cc3afbc` → `97fec19`):
 by stashing the cottage helper and reproducing the same flaky range. Out
 of scope for this session; tracked as follow-up.
 
-**Next-session move:** investigate the residual +X edge-slide divergence
-in `Transition.transitional_insert` / `AdjustOffset`'s handling of a
-`cn=(0,0,-1)` head-bump. Live treats it as a Z-only constraint and
-slides the remaining XY motion along the cottage floor; harness blocks
-the entire move vector instead. The harness's
-`LiveCompare_FirstCap_ResidualXMotionDivergence_DocumentsNextInvestigation`
-test gives <1s feedback per fix attempt. ~2 hours estimate.
+**Evening v3 finding (2026-05-23 PM, even later) — NEW root-cause
+hypothesis identified:** the cottage-floor cap is a SYMPTOM. The actual
+bug is **stale ramp contact plane causing per-tick Z drift** that makes
+the cap reachable in the first place.
+
+Evidence:
+- Body's contact plane at cap = ramp's plane (n=(0, 0.7190, 0.6950),
+  d=-69.5035) from the live capture's `bodyBefore`
+- Cellar ramp's actual world XY: X∈[129.7, 131.3], Y∈[10.19, 13.09]
+  (computed from the cellar cell fixture's vertex data + WorldTransform)
+- Player position at cap: world (141.5, 7.22, 92.74) — **10 m away**
+  from the ramp in cell-local X
+- `AdjustOffset` projects requested motion onto the contact plane
+  (normal part removed). Math: dot((0.0266, -0.4022, 0), (0, 0.719, 0.695))
+  = -0.2892 → projected = (0.0266, -0.1943, +0.2010). **+0.201 m of
+  Z gain per tick**, applied because the engine believes the player
+  is on the slope.
+- Head sphere top at cap = foot Z + 1.68 = 94.42. Cottage floor at
+  Z=94.00. **Head sphere exceeds cottage floor by 0.42 m** → cap fires
+- If the contact plane refreshed to the flat cellar floor when the
+  player walked off the ramp, AdjustOffset would produce zero Z gain
+  (the requested motion has no Z component, and projecting onto a
+  horizontal plane leaves XY unchanged). No drift, no cap.
+
+How this question surfaced: user asked "we know how retail OPENs it
+from above, how hard can it be to know how to open it from below?" —
+that reframing made the question "what's different about our state
+when walking up vs down?" The answer: **nothing, actually — the
+cottage geometry is the same. But our contact plane is wrong.** The
+six prior fix attempts were all investigating the cap-event mechanics
+(step-up, slope projection at the cap, edge-slide, SidesType, +X
+residual). None questioned why the contact plane was the ramp at all
+when the player was 10 m from the ramp.
+
+**Next-session move:** verify the stale-contact-plane hypothesis
+chronologically against the live capture (walk the JSONL records, find
+the last tick the player was on the actual ramp, quantify Z drift),
+then locate the walkable-refresh code path in
+`Transition.FindEnvCollisions` / `SpherePath.SetWalkable` that's
+supposed to detect a new walkable polygon under the sphere and
+overwrite the contact plane. Retail decomp anchor:
+`CObjCell::find_env_collisions`. Full pickup prompt at the bottom of
+[`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`](docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md).
 Original demo scenario (Holtburg Sewer end-to-end) is unreachable: sewer
 doesn't exist on this server, and **issue #95** (portal-graph visibility
 blowup) blocks any substitute dungeon. Revised M1.5 demo split into
diff --git a/docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md b/docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
index 0be9de4..264e402 100644
--- a/docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
+++ b/docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
@@ -13,11 +13,36 @@ documents the FIRST evidence-driven step in the saga.
 
 ## TL;DR
 
-**Updated 2026-05-23 evening v2: apparatus convergence shipped.** The
-harness now reproduces the live cottage-floor cap event bit-perfect on
-the collision normal. The residual divergence is a single +X-motion
-edge-slide gap; everything else round-trips. The session below covers
-both arcs (root-cause identification THEN convergence).
+**Updated 2026-05-23 evening v3: NEW root-cause hypothesis identified —
+STALE RAMP CONTACT PLANE causes per-tick Z drift, which is what makes
+the cottage-floor cap reachable in the first place.**
+
+- Player position at cap: world (141.5, 7.2, 92.7). The cellar ramp's
+  actual world XY is X∈[129.7, 131.3] — the player is **10 meters away
+  from the ramp** in cell-local space.
+- Body's contact plane: ramp's plane (n=(0, 0.719, 0.695), d=-69.5035).
+  Stale; should be the flat cellar floor (n=(0,0,1)).
+- AdjustOffset projects forward motion along that stale ramp plane.
+  Mathematically: requested delta (+0.0266, -0.4022, 0) → projected
+  (+0.0266, -0.1943, +0.2010). **+0.2010 m of Z lift per tick.**
+- After enough horizontal-walking ticks, the head sphere rises to
+  Z=94 and hits the cottage floor's downward-facing back-face polygon.
+  Cap fires.
+- The cap is a SYMPTOM. The root cause is the contact plane not
+  refreshing when the player walks off the ramp onto the flat cellar
+  floor. Retail must re-find the walkable plane each tick; we're
+  keeping the stale ramp seed.
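The projection arithmetic in the bullets above can be re-derived in a few lines. This is a minimal sketch rather than the engine's `AdjustOffset`: the helper name `project_onto_plane` is hypothetical, and it assumes the projection simply subtracts the component of the move along a unit-length contact-plane normal, using only the numbers quoted above.

```python
# Hypothetical helper (NOT the engine's AdjustOffset): remove the component
# of a requested move that lies along a unit-length plane normal.
def project_onto_plane(move, normal):
    d = sum(m * n for m, n in zip(move, normal))
    return tuple(m - d * n for m, n in zip(move, normal))


requested = (0.0266, -0.4022, 0.0)   # per-tick requested delta quoted above
ramp_normal = (0.0, 0.7190, 0.6950)  # stale ramp contact plane normal
flat_normal = (0.0, 0.0, 1.0)        # flat cellar floor normal

on_ramp = project_onto_plane(requested, ramp_normal)
on_flat = project_onto_plane(requested, flat_normal)

print("ramp plane:", tuple(round(c, 4) for c in on_ramp))
print("flat plane:", tuple(round(c, 4) for c in on_flat))
```

With the quoted inputs, the ramp-plane result carries roughly +0.201 of Z per tick, while the flat-plane result has no Z component at all, which is the difference the hypothesis rests on.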
+
+**This explains why six prior fix attempts missed.** Step-up,
+AdjustOffset projection, SidesType, edge-slide, +X residual — all
+were investigating the cap event mechanics, not the upstream Z drift
+that made the cap reachable. The harness convergence (Section "What
+shipped 2026-05-23 evening v2") is still valuable as the deterministic
+reproduction infrastructure; the new hypothesis is the **next** thing
+to verify against that infrastructure.
+
+(Sections below preserve the evening-v2 arc for context: apparatus +
+cap-event reproduction.)
 
 - **Evidence-driven apparatus shipped.** `PhysicsResolveCapture` writes
   one JSON Lines record per player ResolveWithTransition call when
@@ -317,52 +342,181 @@ range. All 21 issue-#98-relevant tests (12 harness + 4
 
 ---
 
+## The stale-contact-plane finding — full evidence (2026-05-23 evening v3)
+
+### How the question led to the answer
+
+User asked: "We know how retail OPENs it from above, how hard can it
+be to know how to open it from below?" — the implicit question being
+"if walking on the cottage floor from above works fine, why doesn't
+walking up from below?"
+
+That reframed the investigation. The cottage floor is the SAME
+polygon set whether viewed from above (walking on it) or below
+(head-bumping it from the cellar). Retail handles both. If our cap
+fires from below, what's different about our state?
+
+Tracing the harness's `LiveCompare_FirstCap_DiagnosticDump` output
+revealed:
+
+1. **The contact plane the engine started with**: ramp's plane
+   `n=(0, 0.7190, 0.6950), d=-69.5035`. From the live capture's
+   `bodyBefore.contactPlane`.
+
+2. **Cellar ramp's actual world position**: vertices computed from
+   the cellar cell's fixture put the ramp at world
+   X∈[129.7, 131.3], Y∈[10.19, 13.09], Z∈[92.5, 95.5]. The ramp is
+   in the +Y corner of the cellar, ~1.6 m wide.
+
+3. **Player position at cap**: world (141.5, 7.22, 92.74). 10+ m
+   away from the ramp in X.
+
+4. **The +Z drift math**: `AdjustOffset` projects the requested
+   motion onto the plane perpendicular to the contact-plane normal:
+   - requested = (+0.0266, -0.4022, 0)
+   - dot(requested, ramp normal) = 0·0.0266 + 0.719·(-0.4022) +
+     0.695·0 = -0.2892
+   - projected = requested - (-0.2892)·rampNormal =
+     (+0.0266, -0.1943, +0.2010)
+   - **+0.2010 m of Z gain per tick**, applied because the contact
+     plane the engine believes the player is on is the slope.
+
+5. **The cap math**: foot Z at cap = 92.74. Head sphere center at
+   foot Z + sphereHeight 1.2 = 93.94. Head sphere top at
+   foot Z + 1.68 = 94.42. **Cottage floor at world Z=94.00.** Head
+   sphere top exceeds cottage floor by 0.42 m → cap fires from
+   below.
+
+If the contact plane were the flat cellar floor (n=(0,0,1) at
+Z=90.95) instead of the ramp, AdjustOffset's projection would
+produce zero Z gain (requested motion has no Z component, projection
+onto flat-floor plane preserves XY). No drift, no cap.
+
+### Why this fits the user-facing bug
+
+- "Stuck climbing cellar" — the player walks forward, accumulates Z,
+  bumps cottage floor, can't progress. Matches what the user sees.
+- "Pure jump in cellar caps at same Z" — jumping doesn't refresh the
+  contact plane either. Drift continues. Matches.
+- "Six prior fix attempts failed" — all attempted to fix the CAP
+  mechanics (step-up, slope projection at the cap, edge-slide). None
+  questioned why the contact plane was the ramp at all.
+
+### What still needs verification (next session's task)
+
+1. **Chronological evidence**: walk the live capture from the start of
+   the cellar session. When did the player last stand on the actual
+   ramp? Does `bodyBefore.contactPlane` persist as the ramp's plane
+   across many ticks of horizontal walking? Quantify the cumulative
+   Z drift.
+
+2. **The walkable-refresh gap**: where in
+   `Transition.FindEnvCollisions` / `SpherePath.SetWalkable` /
+   related is the contact plane supposed to be refreshed when the
+   sphere is over a different walkable polygon? Retail's
+   `CObjCell::find_env_collisions` is the decomp anchor — find the
+   path that detects a NEW walkable and overwrites the contact
+   plane, and find where our engine skips that.
+
+3. **Retail cdb cross-check** (optional, definitive): attach cdb to a
+   running retail acclient, walk to a cottage cellar, log the
+   contact plane each tick. If retail's contact plane refreshes
+   to (0,0,1) when the player walks off the ramp, hypothesis
+   confirmed.
+
+---
+
 ## Pickup prompt for next session
 
 ```
-A6.P3 #98 — apparatus convergence landed, residual X-motion divergence
-is next.
+A6.P3 #98 — apparatus convergence landed, NEW root-cause hypothesis
+(stale ramp contact plane) needs verification.
 
-Read FIRST:
+Read FIRST (in order, ~15 min):
   1. docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
-     (this doc — particularly the "What shipped 2026-05-23 evening v2"
-     and "The residual divergence" sections)
-  2. tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs
-     (especially the two LiveCompare_FirstCap_* tests and the
-     RegisterCottageGfxObj helper)
-  3. CLAUDE.md "Current A6 phase" block
+     — start with TL;DR (evening v3 update at top), then the section
+     "The stale-contact-plane finding — full evidence" near the bottom.
+     Skip the middle sections (evening v1 + v2 arcs) unless context is
+     needed.
+  2. CLAUDE.md "Current A6 phase" block — look for the "Evening v3
+     finding" paragraph.
+  3. tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs
+     — the RegisterCottageGfxObj helper + 2 LiveCompare_FirstCap_*
+     tests are what you'll iterate against.
 
-State both altitudes:
+State both altitudes (one sentence each):
   Currently working toward: M1.5 — Indoor world feels right
-  Current phase: A6.P3 — apparatus convergence shipped. Harness now
-  reproduces the live cottage-floor cap event (cn=(0,0,-1) round-trips
-  bit-perfect). Residual: +0.0266 m of +X motion lost in the harness's
-  post-cap slide where live preserves it.
+  Current phase: A6.P3 — apparatus convergence shipped (cap event
+  reproduces bit-perfect). New root-cause hypothesis: stale ramp
+  contact plane causes per-tick Z drift that makes the cap reachable.
+  Needs verification.
 
-Two concrete next moves:
+What was shipped today (3 commits — DO NOT REDO):
+  - cc3afbc: GfxObj dump infrastructure (ACDREAM_DUMP_GFXOBJS)
+  - 97fec19: Harness reproduces cottage-floor cap (RegisterCottageGfxObj)
+  - 7729bdc + (this commit): findings doc + CLAUDE.md updates
 
-(A) Investigate the +X edge-slide divergence in the harness. The
-    LiveCompare_FirstCap_ResidualXMotionDivergence_DocumentsNextInvestigation
-    test currently passes asserting the divergence; flipping it should
-    drive the investigation. Likely target: Transition.transitional_insert
-    / AdjustOffset's handling of a cn=(0,0,-1) head-bump — live treats
-    it as Z-only constraint and edge-slides the remaining XY motion;
-    harness blocks all motion. Decomp anchor: acclient_2013_pseudo_c.txt
-    in the find_obj_collisions → adjust_sphere_to_plane chain. ~2 hours
-    estimate.
+The hypothesis with full math:
+  - Body's contact plane = ramp's plane (n=(0,0.719,0.695), d=-69.5035)
+  - Player position at cap = world (141.5, 7.22, 92.74)
+  - Cellar ramp's actual world XY = X∈[129.7, 131.3] — 10 m from player
+  - AdjustOffset projects requested move onto the contact plane
+  - Per-tick Z gain ≈ 0.201 m from slope projection on STALE ramp plane
+  - Accumulates over ticks → head sphere reaches Z=94 → bumps cottage
+    floor → cap fires
+  - If contact plane refreshed to flat cellar floor (n=(0,0,1)) when
+    player walks off ramp, no Z drift, no cap
 
-(B) Attach cdb to retail at the cottage ramp-top, trace the BSP queries,
-    compare polygon-by-polygon what retail finds vs what acdream finds.
-    Authoritative for the "how does retail differ?" question but
-    larger scope (~half day setup + capture).
+Concrete next moves (in order):
 
-(A) is recommended — the harness now isolates this divergence to a
-specific known XY slide path; the test gives <1s feedback per fix
-attempt. (B) becomes valuable if (A) hypothesis chase stalls.
+(1) **Verify the hypothesis chronologically.** Walk
+    a6-issue98-resolve-capture-2.jsonl (or the cottage capture
+    fixture's full file) from the start. Find when the player last
+    stood on the actual ramp (within world X∈[129.7, 131.3], Y∈[10.19,
+    13.09]). Quantify: how many ticks does the body's contact plane
+    persist as the ramp's plane while the player walks horizontally
+    away? Compute the cumulative Z drift. Should match observed Z=92.74
+    at cap if the hypothesis holds. (Probably 30 min PowerShell jq.)
 
-CLAUDE.md rules apply throughout. NO speculative fixes — the saga
-already converted from speculation to evidence-driven; keep it that
-way.
+(2) **Locate the walkable-refresh code path.** In
+    src/AcDream.Core/Physics/TransitionTypes.cs, search for where
+    Transition.FindEnvCollisions or SpherePath.SetWalkable is supposed
+    to detect a new walkable polygon under the sphere and overwrite
+    the contact plane. The fix likely lives at the call site that
+    EITHER fails to fire OR fires but doesn't replace the existing
+    contact plane.
+
+(3) **Cross-ref retail decomp.** acclient_2013_pseudo_c.txt's
+    CObjCell::find_env_collisions + the walkable-detection chain.
+    Find the path where retail unconditionally replaces
+    contact_plane when a new walkable is found. Quote the line
+    numbers in the fix commit.
+
+(4) **Implement the fix + verify against harness.** The harness's
+    LiveCompare_FirstCap_HarnessReproducesCottageFloorCapNormal test
+    currently PASSES asserting the cap reproduces. After the fix,
+    if the contact plane refreshes correctly, the cap should NOT fire
+    (no Z drift to make it reachable). The test should start FAILING
+    — that's the signal the fix works.
+
+(5) **Visual verification (user-side).** Launch acdream live, walk
+    into a Holtburg cottage, down to the cellar, then back up. The
+    user-facing bug should resolve if the hypothesis is correct.
+
+Decomp grep targets:
+  - CObjCell::find_env_collisions
+  - CPhysicsObj::find_object_collisions
+  - CTransition::find_walkable
+  - CSpherePath::set_walkable / walkable_hits_sphere
+  - OBJECTINFO::object → contact_plane writes
+
+CLAUDE.md rules apply throughout:
+  - NO speculative fixes — the saga's converted to evidence-driven.
+    Verify hypothesis with chronological capture BEFORE coding.
+  - Visual verification belongs to the user.
+  - If the chronological verification (step 1) shows the contact
+    plane is NOT actually stale across many ticks, the hypothesis is
+    wrong — pivot to retail cdb trace (definitive oracle).
 
 Out-of-scope but observed: pre-existing test suite has 8–19 failures
 across runs of the same code due to static-state leakage between test
@@ -370,6 +524,10 @@ classes (PhysicsResolveCapture, PhysicsDiagnostics statics). Targeted
 issue-#98 tests pass deterministically in isolation. Don't touch the
 flakiness this session; it's a separate investigation.
 
+Test baseline: harness's 12 CellarUpTrajectoryReplayTests + 4
+GfxObjDumpRoundTripTests + 1 new PhysicsDiagnosticsTests + 4
+CellDumpRoundTripTests all pass in isolation. Maintain.
+
 Test baseline: 1178 + 8 pre-existing failures (serial run). Maintain
 throughout. The previously-failing
 LiveCompare_FirstCap_HarnessMissesCottageFloorBecauseCottageGfxObjNotRegistered