Closes the apparatus loop. Compares acdream's deterministic replay
(commit 856aa78) side by side against retail's cdb capture taken via
Step 4's runner. The divergence target is named; the fix plan is the
next plan.

Retail data (cellar_up_capture_1):
- 35,219 BP hits over ~5 seconds of motion
- BPE (set_contact_plane): 161 writes, ALL to one of two flat planes
  (n=(0,0,1) d=-93.9998 = cottage floor @ Z=94, OR d=-90.95 = cellar
  floor @ Z=90.95). Retail NEVER sets ContactPlane to the cellar ramp.
- BPC (find_crossed_edge): 1 hit in 35K. Retail barely uses this
  predicate during cellar-up.
- BPA (find_walkable) sphere position at each cottage-floor
  acceptance: sphere LOCAL Z = +0.48 to +0.63 (resting on top of the
  floor plane). Sphere world Z ≈ 94.48.

acdream replay (Issue98CellarUpReplayTests):
- At the failing-frame sphere (world (141.7, 8.4, 92.0)), the cottage
  cell 0xA9B40143's poly 0x0004 reports insideEdges=false AND
  overlapsSphere=false. Sphere local Z = -0.69 (below the cottage
  floor plane). 0xA9B40146 has no walkable candidate at all. Step-up
  has nothing to step onto → stuck.

Sphere world Z delta: 2.47m. Retail's sphere is about 2.5m higher than
ours at the decision point. The fix targets, in priority order:

1. (HIGHEST CONFIDENCE) Step-up + ramp climb doesn't gain enough Z per
   tick. Retail climbs the ramp GRADUALLY across thousands of ticks;
   ours oscillates at world Z ≈ 92 without altitude gain. Look at
   Transition.AdjustOffset (slope projection) and Transition.DoStepUp
   (does it reset WalkInterp like retail's step_sphere_up?).
2. Cottage-cell candidacy uses wrong sphere reference. Check what
   sphere CheckOtherCells passes to BSPQuery.FindCollisions: is it
   the step-lifted sphere or the pre-step sphere?
3. (SECONDARY) find_crossed_edge over-use. Our walkable test calls
   FindCrossedEdge heavily; retail barely uses it. Possibly a
   code-shape mismatch in step-up vs walkable-acceptance flow.
4. (LOW CONFIDENCE) Ramp polygon normal divergence. Verify via test
   after any fix.

The apparatus that gets us here:
- tests/AcDream.Core.Tests/Fixtures/issue98/*.json (real cell geometry)
- Issue98CellarUpReplayTests (7 tests, <1ms each, deterministic bug
  reproduction)
- tools/cdb/issue98-runner.ps1 (reusable for any future capture)
- docs/research/2026-05-23-a6-captures/cellar_up_capture_1/ (this
  capture, checked in for future analyses)

Next plan: pick Target 1 or 2 from the comparison doc and write the
fix plan against it. The replay harness is the test loop; a fix that
doesn't change the failing assertions in Issue98CellarUpReplayTests is
not the fix.

# A6.P3 issue #98 — acdream replay vs retail cdb comparison

**Date:** 2026-05-23
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Status:** Apparatus complete. Divergence identified. Fix plan to follow.

This document closes the loop on Step 5 of
[`C:\Users\erikn\.claude\plans\i-did-some-work-sharded-acorn.md`](../../C:/Users/erikn/.claude/plans/i-did-some-work-sharded-acorn.md).
It compares acdream's deterministic-replay output against the retail
cdb capture taken at the equivalent scenario, and names the
divergence target for the (next) fix plan.

The four prior sessions (2026-05-22 AM + PM, 2026-05-23 AM + PM)
shipped 10+ speculative fixes without data. This session shipped the
apparatus that turns the next attempt into evidence-driven work
(commits `35b37df` → `6f666c1` on top of slice 5's `cf3deff`).

---

## TL;DR — the divergence target

**Retail's `BSPLEAF::find_walkable` accepts the cottage main floor
polygon when the sphere is RESTING ON TOP of it.** Sphere local
Z = +radius (= +0.48 in the cottage cell). Sphere world Z ≈ 94.48
(cottage floor at world Z=94, plus radius).

**acdream's failing-frame sphere is 0.69m BELOW the cottage main floor
plane** when our walkable query runs. Sphere local Z = -0.6883 in
0xA9B40143. Sphere world Z ≈ 93.31.

Delta: **retail's sphere is 1.17 m higher** at the equivalent decision
point. (That figure uses the sphere seen by the 0xA9B40143 walkable
query. Measured from the failing-frame replay anchor at world
Z ≈ 92.01, as in the side-by-side table, the delta is 2.47 m.) Either:

1. Our step-up sequence doesn't lift the sphere high enough before
   `find_walkable` is called against the cottage cell, OR
2. We're calling `find_walkable` against the cottage cell using the
   wrong sphere reference (foot-sphere center instead of the
   step-lifted center), OR
3. The cellar→cottage transition in retail happens GRADUALLY across
   many physics ticks (the sphere climbs the ramp one step at a time),
   and acdream's per-tick climb is too small.

The fix plan needs to choose between (1), (2), and (3) — most likely
(3) given retail's BPE-write distribution.

A surprising secondary finding: **`CPolygon::find_crossed_edge` fires
ONLY ONCE in 35K probe hits in retail.** Our replay harness uses
`FindCrossedEdge` as the primary edge-containment test. Either retail
takes a different path through the walkable predicate cascade, or
acdream is over-reliant on the edge test for a case retail doesn't
hit.

---

## Apparatus shipped this session

Six commits on top of `cf3deff` (slice 5):

| Commit | What |
|---------|------|
| `35b37df` | chore(phys): A6.P3 #98 triage — revert neg-poly + bldg-check experiments. Kept: render-vs-physics origin split (GameWindow), terrain-hole cutout, multi-sphere CellTransit, step-walk diagnostic probes. Reverted: neg-poly path split, bldg-check flag, isBuilding propagation, IsLandblockBuilding. Test baseline restored to 1148+8 base. |
| `f62a873` | feat(phys): Step 2 — cell-dump probe (`ACDREAM_DUMP_CELLS=0xA9B4xxxx,...`) + JSON DTOs (`CellDump`, `PolygonDump`, etc.) + `CellDumpSerializer` (Capture / Read / Write / Hydrate) + 4 round-trip tests. |
| `3f56915` | capture(phys): Three cell fixtures from live capture — 0xA9B40143 (14 polys), 0xA9B40146 (4 polys), 0xA9B40147 (37 polys). All share worldOrigin=(130.5, 11.5, 94.0) with 180° yaw. |
| `856aa78` | test(phys): Step 3 — `Issue98CellarUpReplayTests` — 7 tests reproducing the live failure pattern deterministically (<1ms per test). Confirms 0xA9B40143 poly 0x0004 rejected at the failing-frame sphere; 0xA9B40146 has no walkable candidate at all. |
| `6f666c1` | tools(cdb): Step 4 — `issue98-cellar-up-find-walkable.cdb` + `issue98-runner.ps1` for retail-side capture. BPA/B/C/D/E/F break on find_walkable, walkable_hits_sphere, find_crossed_edge, check_other_cells, set_contact_plane, adjust_sphere_to_plane. |
| (this doc) | Step 5 — divergence comparison. |

---

## Raw data — retail cdb capture

Capture: [`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.log`](2026-05-23-a6-captures/cellar_up_capture_1/retail.log)
(decoded: `retail.decoded.log`)

User ran retail acclient.exe v11.4186 attached via
`tools/cdb/issue98-runner.ps1 -ScenarioTag "cellar_up_capture_1"`. They
walked up and down a Holtburg cottage cellar stair several times. cdb
captured 35,219 BP hits over ~5 seconds of motion.

Hit distribution:

| BP  | Function                            | Hits    | Notes |
|-----|-------------------------------------|---------|-------|
| BPA | `BSPLEAF::find_walkable`            | 6,160   | per-leaf walkable query |
| BPB | `CPolygon::walkable_hits_sphere`    | 7,028   | per-polygon overlap test |
| BPC | `CPolygon::find_crossed_edge`       | **1**   | almost never fires! |
| BPD | `CTransition::check_other_cells`    | 21,422  | outer dispatcher fires very frequently |
| BPE | `COLLISIONINFO::set_contact_plane`  | **161** | ContactPlane writes |
| BPF | `CPolygon::adjust_sphere_to_plane`  | 431     | sphere projections |

### BPE — retail's accepted ContactPlanes

Every one of the 161 BPE writes lands on one of TWO planes:

```
n=(0, 0, 1)  d=-93.9998  → world Z=94    (cottage main floor)
n=(0, 0, 1)  d=-90.9500  → world Z=90.95 (cellar floor)
```

Retail's ContactPlane is **never** set to:
- the cellar ramp (normal ≈ (0, -0.719, 0.695))
- any of the cellar wall polygons
- the cellar ceiling (poly 0x0020 in our nomenclature — normal=(0,0,-1) at world Z=93.82)

The transition cellar floor → cottage main floor happens directly:
ContactPlane shifts from `d=-90.95` to `d=-93.9998` with no
intermediate plane.

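The world-Z labels on those two planes follow from the plane equation n·p + d = 0. A minimal Python sketch of that arithmetic, assuming the capture logs planes in that form; `plane_height` is a hypothetical helper, not an acdream or retail function:

```python
def plane_height(normal, d):
    """Return the Z at which a horizontal plane n.p + d = 0 sits.

    Only valid for normals of the form (0, 0, nz) with nz != 0, which
    is the case for both flat planes in the BPE capture.
    """
    nx, ny, nz = normal
    if nx != 0 or ny != 0 or nz == 0:
        raise ValueError("only horizontal planes have a single height")
    return -d / nz


cottage_floor_z = plane_height((0, 0, 1), -93.9998)
cellar_floor_z = plane_height((0, 0, 1), -90.95)
climb = cottage_floor_z - cellar_floor_z

print(round(cottage_floor_z, 2), round(cellar_floor_z, 2), round(climb, 2))
```

The two heights differ by about 3.05 m, which is the climb the sphere has to make between the two accepted ContactPlanes.
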
### BPA — sphere position at each cottage-floor acceptance

The find_walkable call immediately before each BPE write to the
cottage floor shows a consistent sphere position pattern:

| BPE hit | Last BPA before    | Sphere LOCAL                 | Notes |
|---------|--------------------|------------------------------|-------|
| #1      | hit#435 (cell B)   | (-0.3270, 0.5998, +0.6300)   | first cottage-floor accept |
| #50     | hit#2533 (cell B)  | (-0.3131, 0.7340, +0.6300)   | cz unchanged |
| #100    | hit#3822 (cell B)  | (-0.3245, 0.3292, +0.6300)   | cz unchanged |
| #160    | hit#6159 (cell B)  | (-0.3195, 0.5271, +0.6300)   | cz unchanged |

Sphere local Z is consistently **+0.6300** in cell B at the moment
retail accepts. BPB hit#7012 reports cell B's cottage floor plane with
d=-0.15. Read with the same convention as the BPE planes above
(n=(0,0,1), where d=-93.9998 means Z=94), that puts the floor at local
Z=+0.15, so the sphere center is 0.48m above the floor. Sphere radius
0.48 → the sphere bottom touches the floor, and `walkable_hits_sphere`
accepts.

The find_walkable hit just BEFORE the cell-B query (hit#433, hit#2532,
hit#3820, hit#6158) lands in a different cell ("cell A") at local
position ≈ `(-11.12, 7.16, +0.48)`. Cell A's cottage floor plane is at
local Z=0 → sphere is 0.48m above (= sphere radius), perfectly resting
on the floor.

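The resting-on-floor reading reduces to comparing the sphere center's height above the floor plane with the sphere radius. A minimal Python sketch, assuming horizontal floor planes and the 0.48 radius quoted above. `rests_on_floor` and its tolerance are hypothetical, and the cell B floor height of +0.15 is an assumption that reads the BPB plane `d=-0.15` with the same n·p + d = 0 convention as the BPE planes:

```python
SPHERE_RADIUS = 0.48  # radius quoted in the capture notes


def rests_on_floor(center_z, floor_z, radius=SPHERE_RADIUS, tol=1e-3):
    """True when the sphere bottom touches the floor within tol."""
    return abs((center_z - floor_z) - radius) <= tol


# Cell A: floor at local Z=0, sphere center at local Z=+0.48.
cell_a = rests_on_floor(0.48, 0.0)

# Cell B: sphere center at local Z=+0.63.
# Assumption: floor at local Z=+0.15 (plane d=-0.15 read as n.p + d = 0).
cell_b = rests_on_floor(0.63, 0.15)

print(cell_a, cell_b)
```

Under that assumption both cells describe the same physical state: a sphere whose bottom sits on the cottage floor.
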
**Both cells consistently see the sphere at `local Z = +0.48 to +0.63`
at the acceptance moment.** Sphere world Z ≈ 94.48 — the sphere has
been lifted ABOVE the cottage floor.

---

## acdream replay — sphere position at the equivalent moment

Replay anchor: failing-frame sphere world position
`(141.7164, 8.3937, 92.0093)` r=0.4800, from
[`a6-issue98-negpoly-20260523-135032.out.log`](../../a6-issue98-negpoly-20260523-135032.out.log)
line 11338 (`[walkable-nearest]`) + 11339 (`[issue98-walkable-detail]`).

In cell 0xA9B40143 (cottage neighbour, 14 physics polys):

```
sphere LOCAL = (-11.2892, 4.3653, -0.6883)
nearest walkable: poly 0x0004
  plane n=(0,0,1) d=0 (local) → world Z=94 (cottage floor)
  verts: [(-6.2, 7.6, 0), (-10.0, 7.6, 0), (-10.0, 2.8, 0)]
  signed distance from plane: -0.6883
  abs distance:               0.6883
  gap (abs - radius):         0.2083
  insideEdges:    FALSE (sphere XY beyond triangle edge by 1.29 m on X)
  overlapsSphere: FALSE (|0.6883| > radius 0.48)
```

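The figures in the 0xA9B40143 block can be reproduced with a plane-distance check and a 2D edge-containment check. The Python sketch below is a generic same-side triangle test under the assumption that containment is evaluated in the XY plane; it is not acdream's `FindCrossedEdge` nor retail's `walkable_hits_sphere`, and the helper names are hypothetical:

```python
RADIUS = 0.48


def signed_distance(normal, d, point):
    """Signed distance from point to the plane n.p + d = 0 (unit normal)."""
    return sum(n * p for n, p in zip(normal, point)) + d


def inside_triangle_xy(verts, point):
    """Same-side test in XY; True if the point is in or on the triangle."""
    px, py = point[0], point[1]
    crosses = []
    for i in range(3):
        ax, ay = verts[i][0], verts[i][1]
        bx, by = verts[(i + 1) % 3][0], verts[(i + 1) % 3][1]
        crosses.append((bx - ax) * (py - ay) - (by - ay) * (px - ax))
    return all(c >= 0 for c in crosses) or all(c <= 0 for c in crosses)


sphere_local = (-11.2892, 4.3653, -0.6883)
poly_verts = [(-6.2, 7.6, 0.0), (-10.0, 7.6, 0.0), (-10.0, 2.8, 0.0)]

dist = signed_distance((0.0, 0.0, 1.0), 0.0, sphere_local)
gap = abs(dist) - RADIUS               # how far short of touching the plane
overlaps_sphere = abs(dist) <= RADIUS  # sphere reaches the plane at all?
inside_edges = inside_triangle_xy(poly_verts, sphere_local)

print(round(dist, 4), round(gap, 4), overlaps_sphere, inside_edges)
```

Both rejections are independent: even a sphere lifted onto the plane would still fail containment at this XY, because X = -11.2892 lies outside the polygon's X range of -10.0 to -6.2.
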
In cell 0xA9B40146 (cottage neighbour, 4 physics polys):

```
sphere LOCAL = (similar)
nearest walkable: NONE
(the cell has no Z-up polygon close enough to be selected)
```

In cell 0xA9B40147 (cellar primary, 37 physics polys):

```
sphere LOCAL = (-11.2164, 3.1063, -1.9907)
nearest walkable: the cellar ramp (poly 0x0008 — n=(0,-0.719, 0.695))
  → accepted as ContactPlane
```

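The 0xA9B40147 local position follows from the replay anchor and the shared cell transform. A minimal Python sketch, assuming the cell transform is a yaw-only rotation about Z plus a translation, with the worldOrigin and 180° yaw recorded for the fixtures; `world_to_local` is a hypothetical helper:

```python
import math


def world_to_local(world, origin, yaw_deg):
    """Invert a yaw-only local->world transform (rotation about Z)."""
    dx, dy, dz = (w - o for w, o in zip(world, origin))
    yaw = math.radians(-yaw_deg)  # inverse rotation
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx - s * dy, s * dx + c * dy, dz)


WORLD_ORIGIN = (130.5, 11.5, 94.0)  # shared by all three cell fixtures
anchor_world = (141.7164, 8.3937, 92.0093)

local = world_to_local(anchor_world, WORLD_ORIGIN, 180.0)
print(tuple(round(v, 4) for v in local))
```

Under the same assumption, the 0xA9B40143 local position (-11.2892, 4.3653, -0.6883) maps back to a different world point at Z ≈ 93.31 rather than to this anchor, which matches the world Z quoted in the TL;DR.
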
Our replay confirms the live failure: cottage-cell walkable queries
return no usable result; cellar ramp is the only ContactPlane we ever
get.

---

## Side-by-side comparison

| Field                               | Retail (BPE #1)                 | acdream (negpoly fail)              |
|-------------------------------------|---------------------------------|-------------------------------------|
| Sphere world Z                      | **94.48**                       | **92.01**                           |
| Cottage floor plane (world)         | Z = 94                          | Z = 94                              |
| Sphere position vs cottage floor    | **+0.48 m ABOVE**               | **-1.99 m BELOW**                   |
| Sphere top vs cottage floor         | +0.96 m above                   | -1.51 m below                       |
| Walkable accepted in cottage cell?  | **YES** — sphere rests on plane | **NO** — sphere far below plane     |
| ContactPlane set to cottage floor?  | **YES** (among the 161 BPE writes) | **NO** (never)                   |
| find_crossed_edge invocations       | 1 (in 35K BPs)                  | (used heavily by our walkable test) |
| check_other_cells invocations       | 21,422                          | (per-tick, similar order)           |

**Sphere world Z delta: 2.47 m.** Retail's sphere is nearly 2.5 m
higher than ours at the equivalent decision point.

---

## Plausible fix targets, in priority order

These are HYPOTHESES — the fix plan must verify each before changing
code. Each is testable against the replay harness without launching
the client.

### Target 1 (highest confidence): step-up + ramp climb doesn't gain enough Z per tick

Retail's data shows the sphere climbs the ramp GRADUALLY across many
ticks — BPB hits move smoothly from sphere local Z=-2.57 (resting on
cellar floor) through intermediate values up to sphere local Z=+0.48
(resting on cottage floor) over ~7,000 walkable_hits_sphere calls.

Our `[step-walk]` diagnostic from the failing log shows the sphere
oscillating at world Z ≈ 92.0 — never gaining altitude. The ramp's
ContactPlane is being set but `AdjustOffset` is consuming all
WalkInterp on the lift, leaving nothing for forward motion (slice 7
handoff's reading was right on this).

Look at:
- `Transition.AdjustOffset` — when ContactPlane is the ramp, forward
  motion should project to ramp-local, gaining Z. Does it?
- `Transition.DoStepUp` — when does step-up fire? Is it lifting by
  the right amount? Compare to retail's step_sphere_up.
- The interaction between WalkInterp depletion and step-up — does our
  step-up reset WalkInterp like retail does?

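The Z gain expected from slope projection can be estimated with a plain plane projection. The Python sketch below is a generic vector projection, not acdream's `AdjustOffset` or retail's equivalent; the forward direction is a hypothetical choice, picked as the uphill horizontal direction in the ramp normal's own frame:

```python
def project_onto_plane(v, n):
    """Remove the component of v along the (unit) plane normal n."""
    dot = sum(a * b for a, b in zip(v, n))
    return tuple(a - dot * b for a, b in zip(v, n))


ramp_normal = (0.0, -0.719, 0.695)  # ramp normal from the polydump capture
forward = (0.0, 1.0, 0.0)           # hypothetical: horizontal motion uphill

along_ramp = project_onto_plane(forward, ramp_normal)
z_gain_per_unit = along_ramp[2]

print(tuple(round(c, 4) for c in along_ramp))
```

With this normal, each unit of horizontal input that survives projection carries roughly half a unit of Z. A replay sphere that stays pinned at world Z ≈ 92.0 while the ramp is the ContactPlane is therefore not moving along the projected offset at all.
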
### Target 2: cottage-cell candidacy uses wrong sphere reference

Retail iterates cells with the SAME sphere across find_walkable calls
in a tick. The sphere position visible to find_walkable for the
cottage cell is already at the lifted position. acdream's
`CellTransit.FindCellSet` uses `sp.GlobalSphere` — but at what tick
phase? If we use the pre-step-up sphere center to decide cottage-cell
candidacy, and then run the walkable query at the same pre-step-up
position, we'll never see the cottage cell as walkable.

Look at:
- `CheckOtherCells` in `TransitionTypes.cs` — what sphere does it
  pass to `BSPQuery.FindCollisions`? Does it use the step-lifted
  position or the pre-step position?
- The retail oracle `CTransition::check_other_cells` at
  `acclient_2013_pseudo_c.txt:272717-272798`.

### Target 3: find_crossed_edge is over-used in our walkable acceptance

Retail's BPC hit count of 1 in 35K is a striking outlier. Either
retail's walkable acceptance never needs the edge containment test
(because `walkable_hits_sphere` does enough), or `find_crossed_edge` is
gated behind a different code path we're not hitting.

Look at:
- `BSPQuery.FindCrossedEdge` — when is it called? Compare to retail's
  `CPolygon::find_crossed_edge`. Maybe we call it in step-up, retail
  doesn't.

This is a SECONDARY target — not directly the issue #98 failure mode,
but a code-shape divergence worth investigating once the primary fix
lands.

### Target 4 (low confidence): the cellar ramp normal-Z is wrong

If our cellar ramp polygon has a slightly wrong normal compared to
retail, AdjustOffset's slope projection would compute different Z
gains. The polydump capture shows ramp normal (0, -0.7190, 0.6950);
the JSON fixture has the same. Likely not the bug, but worth
verifying via `dotnet test` after any fix attempt.

---

## What the apparatus delivers for future fix attempts

1. **`Issue98CellarUpReplayTests`** runs in <200ms with no client
   launch. Any change to `BSPQuery.FindCrossedEdge`, polygon
   containment, or cell transform shows up instantly.

2. **JSON fixtures in `tests/AcDream.Core.Tests/Fixtures/issue98/`**
   are real-geometry captures. Any future fix can call
   `CellDumpSerializer.Hydrate` to load them and drive the predicates
   directly.

3. **`tools/cdb/issue98-runner.ps1`** is reusable. Any new
   hypothesis can be re-captured against retail with a 5-minute user
   action.

4. **`tools/cdb/decode_retail_hex.py`** decodes the hex-bits format —
   no changes needed.

5. The retail comparison data is checked into
   `docs/research/2026-05-23-a6-captures/cellar_up_capture_1/` —
   future analyses can re-grep without re-capturing.

---

## What this plan does NOT do

This document does not ship a fix. The fix is the next plan, scoped to
Target 1 (most likely) or Target 2 (next likely). The user should
review this divergence reading before authorizing implementation.

Per CLAUDE.md and the systematic-debugging mandate: 4 prior sessions
guessed and were wrong. This plan refuses to be the 5th.

---

## Pickup prompt for the fix plan

Open this worktree:
`C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`

Then:

```
A6.P3 issue #98 — apparatus complete; ready to write the fix plan.

Read FIRST:
  docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md
  tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs
  docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log

State both altitudes:
  Currently working toward: M1.5 — Indoor world feels right
  Current phase: A6.P3 — fix #98 cellar-up (fix plan)
  Next concrete step: pick Target 1 (step-up Z gain) or Target 2
    (cottage-cell sphere reference) from the comparison doc and write
    the fix plan against it. NO speculative fixes — use the replay
    harness to verify the hypothesis before writing code.

The fix MUST be evidence-driven. The replay harness gives us a 200ms
test loop; a fix that doesn't change the failing assertions in
Issue98CellarUpReplayTests is not the fix.

Test baseline: 1167 + 8 (with apparatus). Maintain through any fix.
CLAUDE.md rules apply. No workarounds without explicit approval.
```