acdream/docs/research/2026-05-21-a6-p1-partial-ship-handoff.md
Erik 2f2b63f8bd docs(handoff): A6.P1 partial-ship — infra DONE, captures 1/9
Pickup-prompt + lessons doc for the A6.P1 capture work. Documents:

- The 16 commits shipping today (infrastructure Tasks 1-14 + cdb
  script v1→v4 iteration + scen1 capture + decoder).
- WHY cdb iterated 4 times: v1 wrong offsets, v2 PowerShell UTF-16,
  v3 cdb %f unreliable with dwo()/@@c++, v4 hex output works.
- Scen1 findings already strong: dispatcher entry frequency mismatch
  (acdream 20× fewer than retail) + ContactPlane write blowup
  (~100-1000× more frequent in acdream) — directly confirms the
  spec's M1.5 hypothesis about per-frame CP resynthesis.
- Per-scenario protocol validated by scen1.
- Pasteable session-start prompt for picking up scenarios 2-9.
- Known issues (kill-cdb-kills-retail, .printf %f unreliable, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:05:17 +02:00

344 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# A6.P1 partial-ship handoff — 2026-05-21
**Status:** Infrastructure complete + scenario 1 (Holtburg inn doorway)
captured end-to-end (retail + acdream paired). Scenarios 29 deferred to
next session.
**Pasteable session-start prompt at the bottom of this doc.**
## TL;DR
A6.P1 ships in two milestones:
1. **Infrastructure milestone (DONE today):** `[push-back]` acdream probe (3
helpers + 3 sites + DebugVM mirror + CLAUDE.md docs), cdb probe script
(v4 with PDB-verified offsets + hex-bits float output), PowerShell
runner with ASCII encoding, README, capture-dir scaffolding,
PDB-match verification, type dumper, hex→float decoder.
2. **Capture milestone (PARTIAL):** 1 of 9 scenarios captured. Scenarios
29 user-driven, deferred at user direction to avoid fatigue.
**Scenario 1 already surfaces two strong M1.5 findings** (before any
formal A6.P2 analysis):
| Metric | Retail | acdream | Notes |
|---|---:|---:|---|
| dispatcher entries (find_collisions / BSPQuery.FindCollisions) | 5,818 | 295 | acdream calls dispatcher **20× less often** |
| ContactPlane writes (set_contact_plane fn / per-field writes) | 18 calls | **73,304** field-writes | acdream **rewrites CP every frame/sub-step** vs retail's per-event |
The CP-write blowup directly confirms the spec's hypothesis
([2026-05-21-phase-a6-indoor-physics-fidelity-design.md §1.2](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md))
that `FindEnvCollisions` indoor branch resynthesizes CP per frame
instead of retaining via Mechanisms A/B/C. Same family as the
`TryFindIndoorWalkablePlane` workaround.
## State both altitudes (next session)
> **Currently working toward: M1.5 — "Indoor world feels right."**
>
> **Current phase: A6 — Indoor physics fidelity (cdb-driven).**
>
> **Next concrete step: Capture scenarios 29 (paired retail + acdream
> traces). Then run A6.P2 analysis on all 9 captures.**
## What shipped today (16 commits)
### Infrastructure (Tasks 114 from the A6.P1 plan)
| Commit | What |
|---|---|
| `ace9e62`, `ad6c89d` | T1: `ProbePushBackEnabled` toggle + roundtrip test |
| `3a173b9` | T2: `LogPushBackAdjust` helper |
| `eb8a318` | T3: instrument `BSPQuery.AdjustSphereToPlane` |
| `2d1f27d` | T4: `LogPushBackDispatch` helper |
| `35631d1` | T5: instrument `BSPQuery.FindCollisions` |
| `66ee757` | T6: `LogPushBackCellTransit` helper |
| `642734d` | T7: instrument `Transition.CheckOtherCells` |
| `dd95c10` | T8: DebugVM `ProbePushBack` mirror |
| `e1f7efe` | T9: CLAUDE.md `ACDREAM_PROBE_PUSH_BACK` env var docs |
| `7bb799b` | T10: `tools/cdb/a6-probe.cdb` v1 (broken offsets) |
| `1c640eb` | T11: `tools/cdb/a6-probe-runner.ps1` (later patched to ASCII) |
| `df315a9` | T12: `tools/cdb/README-a6-probe.md` |
| `0e21f22`, `22e341f` | T13: PDB-match verification (audit trail) |
| `260c60f` | T14: capture-dir scaffolding + findings doc stub |
### cdb script iteration (T15 dry-runs)
| Commit | What |
|---|---|
| `d0c8c54` | v1→v2 prep: type dumper (`a6-types-dump.cdb` + runner) + ASCII runner |
| `7b9b26f` | v2 cdb script: PDB-verified offsets + BP6 fix to `check_walkable` |
| `1b6d49e` | v3 cdb script: `@@c++(*(float*)addr)` for floats (still produced zeros) |
| `2d841cb` | v4 cdb script: hex-bits float output via `%08X` (WORKS) |
### Scen1 capture + decode tooling
| Commit | What |
|---|---|
| `180b4a5` | scen1 retail.log captured (v4 cdb, 13,552 hits, real hex bits) |
| `8ca718a` | scen1 acdream.log paired (84,130 lines, full probe distribution) |
| `194ed3e` | `decode_retail_hex.py` — Python hex→float decoder + scen1 decoded log |
## Why cdb v1→v4 iteration was necessary
The cdb side hit three landmines we didn't anticipate when writing the
A6.P1 plan:
1. **v1: Stack-arg offsets wrong.** Plan's probe actions used arbitrary
registers (`@edx`, `@edi`) to read function args. `__thiscall` puts
non-this args on the stack (`[esp+N]`), not in arbitrary registers.
All 12 BP5 hits printed `Nx=0 Ny=0 ...` — confirming the read
addresses were wrong. **Fix:** type dumper + double-indirect via
`dwo(poi(@esp+N)+offset)`.
2. **v2: BP6 symbol wrong + PowerShell UTF-16 encoding.** v1's
`validate_walkable` doesn't exist in the PDB (the actual function is
`CTransition::check_walkable`). PowerShell's `Tee-Object` writes
UTF-16 LE by default, making logs ungreppable. **Fixes:** BP6 symbol
corrected, runner switched to `Out-File -Encoding ASCII`. v2 had
correct integer reads (substeps=3, insertType=0) but all `%f` floats
still printed as 0.000000.
3. **v3: `%f` doesn't work with `dwo()`.** Switching to
`@@c++(*(float*)addr)` to force C++ interpretation also produced
0.000000 across all float fields. cdb's `.printf %f` appears to not
reliably handle our float values (possibly varargs promotion, possibly
a deeper limitation). **Workaround (v4):** print all floats as 32-bit
hex bits via `%08X`; Python decoder reinterprets via
`struct.unpack('<f', struct.pack('<I', value))`.
The v4 + decoder pattern works. **Pickup sessions should NOT change
the cdb script** unless adding new BPs. The hex-bits encoding is robust
and the decoder validates against known constants (BP6 threshold = FloorZ).
## Scen1 findings (preliminary — formal A6.P2 to follow)
### Capture pair
- Retail: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log` (raw v4 hex) + `retail.decoded.log` (decoded floats).
- Acdream: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log` (84,130 lines).
### BP hit-count distribution (2-sec walk through inn doorway, both clients)
| Site | Retail | Acdream | Ratio (acdream/retail) |
|---|---:|---:|---:|
| transitional_insert / sub-step loop | 7,686 (BP1) | n/a (no acdream probe) | — |
| find_collisions dispatch | 5,818 (BP4) | 295 ([push-back-disp]) | **0.05× (20× fewer)** |
| adjust_sphere_to_plane | 12 (BP5) | 8 ([push-back]) | 0.67× |
| check_other_cells loop | n/a (BP3 zero — no wall hit) | 5 ([push-back-cell]) | — |
| check_walkable / ground verdict | 12 (BP6) | n/a (no acdream probe) | — |
| set_contact_plane / CP writes | 18 (BP7 fn calls) | 73,304 (per-field) | **~1001000× more** |
| step_up | 1 (BP2) | n/a | — |
| set_collide / wall halt | 0 (no wall hit in scen1) | n/a | — |
### Finding 1: dispatcher entry frequency mismatch
Retail's `BSPTREE::find_collisions` fires 5,818 times in ~2 seconds of
walking (~2,900/sec). Acdream's `BSPQuery.FindCollisions` fires only
295 times in the same scenario (~150/sec).
**Possible causes** (investigate during A6.P2):
- Physics tick rate difference (retail 30Hz? per CLAUDE.md
steep-roof finding) vs acdream's tick.
- Different sub-step cadence inside `transitional_insert`
retail's outer loop iterates much more than ours.
- Different number of cells visited per sub-step (retail's CELLARRAY
iteration calls dispatcher per cell; we may only call once
per primary cell).
- Probe scope difference: retail's BP catches `BSPTREE::find_collisions`
(one C++ class). Acdream's `[push-back-disp]` covers
`BSPQuery.FindCollisions` modern overload (one C# method). If our
call paths into dispatcher are differently structured, frequencies
diverge.
### Finding 2: ContactPlane write blowup
Acdream writes 73,304 ContactPlane field-level updates in 30 seconds
(~2,400/sec including the boot phase before the player moved).
Retail's `set_contact_plane` fires 18 times (~6/sec including boot).
Even with a 6× field-write multiplier per `set_contact_plane` call,
that gives ~100 actual CP updates in retail vs ~12,000 in acdream
**100×+ more frequent in acdream**.
**This is the M1.5 hypothesis confirmed empirically.** Per the spec
§1.2, the working hypothesis was that `FindEnvCollisions` indoor
branch rewrites CP every frame instead of retaining it via the three
documented retention mechanisms. The 73K cp-write data confirms.
A6.P3 fix surface: stop rewriting CP every frame; use the existing
LKCP-restore (Mechanism B at `validate_transition`) + Path-6 land
write (Mechanism A) + post-OK step-down probe (Mechanism C).
`TryFindIndoorWalkablePlane` synthesis (the workaround flagged for
A6.P4 removal) is part of the same bad-pattern family.
### Per-call shape match (BP5 hit#1)
| Field | Retail (decoded) | Acdream | Match? |
|---|---|---|---|
| Plane.N | (0, 0, 1) | (0, 0, 1) | ✓ identical |
| Plane.d | -0.0000 | -0.0000 | ✓ identical |
| Sphere.center.x | 0.0046 | -0.4325 | independent walks |
| Sphere.center.y | 10.3072 | 11.0219 | independent walks |
| Sphere.center.z | -0.2700 | 0.4600 | DIFFERENT axis — investigate |
| Sphere.radius | 0.4800 | 0.4800 | ✓ identical |
| WalkInterp (pre) | 1.0000 | 1.0000 | ✓ identical |
| Movement.x | 0.0000 | 0.0000 | ✓ identical |
| Movement.y | -0.0000 | -0.0000 | ✓ identical |
| Movement.z | -0.7500 | -0.5000 | DIFFERENT — investigate |
The shape matches (vertical step-down probe against ground), but two
axis values differ between retail and acdream:
- **Sphere.center.z**: retail -0.27, acdream +0.46. Could be different
local-space conventions (retail's localspace_pos vs acdream's
per-cell transform).
- **Movement.z**: retail -0.75 (the value passed by the call site
in retail's decomp), acdream -0.50 (smaller step-down probe distance).
These could be the BSP correction-path divergence the spec hypothesizes,
or they could be benign convention differences. A6.P2 with the full 9
scenarios will surface which.
## What's deferred (scenarios 29 + A6.P2)
### Scenarios 29 (~40 min user time at ~5 min each)
| # | Tag | Location | Walk script |
|---|---|---|---|
| 2 | scen2_inn_stairs | Holtburg inn, stairs to 2nd floor | Walk up 4 steps, stop on landing |
| 3 | scen3_inn_2nd_floor | Holtburg inn 2nd floor | Walk forward 3 m, sidestep 1 m, walk back |
| 4 | scen4_cottage_cellar | Holtburg cottage with cellar | Walk to cellar opening, descend 2 steps |
| 5 | scen5_sewer_entry | Holtburg sewer entrance | Walk into portal, then walk 2 m forward inside |
| 6 | scen6_sewer_first_stair | Sewer's first stair after entry | Walk down full stair flight |
| 7 | scen7_sewer_inter_room | Between any two sewer rooms via portal | Walk through portal, stop 1 m past |
| 8 | scen8_sewer_chamber | Sewer's multi-Z room | Walk in, traverse center, walk out other side |
| 9 | scen9_sewer_corridor | Sewer narrow corridor | Walk full length end-to-end |
Per-scenario protocol (validated by scen1):
1. User launches retail, navigates character to start point, stops.
2. Run `.\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."`.
Wait for `a6-probe v4 armed:` confirmation in
`docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
3. User performs the scripted walk in retail.
4. cdb auto-detaches at 50K hits (or kill cdb to release retail —
acclient comes down too, accept and relaunch).
5. User launches acdream with all 5 probe env vars
(`ACDREAM_PROBE_PUSH_BACK=1` + indoor_bsp + cell + cell_cache + contact_plane).
Output to `docs/research/2026-05-21-a6-captures/scenN_.../acdream.log`.
6. User walks acdream through the SAME scripted walk.
7. Close acdream gracefully.
8. Run `py tools/cdb/decode_retail_hex.py docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
9. Commit `retail.log`, `acdream.log`, `retail.decoded.log` for that scenario.
### A6.P2 (analysis report) — ~1 day after all 9 scenarios are in
Spec §4 of the design doc defines the 4 mandatory tables:
1. Per-site push-back delta (Table 1)
2. Path-frequency diff (Table 2)
3. ContactPlane lifecycle diff (Table 3)
4. Sub-step state mutations (Table 4)
Plus per-scenario narrative + findings section.
**Already have strong evidence for Finding 2 (CP-write blowup)** from
scen1 alone. A6.P2 quantifies + extends across the remaining 8
scenarios + writes the formal A6.P3 fix sketches.
## Known issues + gotchas (lessons from today)
1. **Killing cdb kills retail** (per CLAUDE.md). Either wait for 50K
threshold via `qd` auto-detach (~60 sec under motion) or accept that
killing cdb takes acclient down too. Relaunch is ~30 sec.
2. **PowerShell `Tee-Object` writes UTF-16 LE.** The runner uses
`Out-File -Encoding ASCII` to fix this. Don't revert.
3. **cdb `.printf %f` is unreliable.** v4 uses hex output + Python
decoder. Do NOT try to "simplify" back to `%f`.
4. **Retail binary must match the PDB** (GUID `{9e847e2f-...}`,
linker UTC `2013-09-06`). Verify with
`py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"`
before any capture session.
5. **Hit-rate budget under motion.** ~13K total hits per 2-sec walk.
Threshold of 50K survives ~8 sec of continuous walking before
auto-detach. For longer scenarios (sewer corridor end-to-end),
the walk may need to be broken into multiple captures OR threshold
bumped to 100K (edit `a6-probe.cdb` `.if (@$t0 >= 50000)``100000`).
6. **BP6 fires with FloorZ (0.6642) not cos85 (0.0872).** v4 confirmed
this — `check_walkable` is called with `PhysicsGlobals.FloorZ` for
ground verdicts. The cos85 value (0.0872) is passed in a different
code path (post-set_collide wall-slide) which didn't fire during
scen1 (no wall hits). Will appear when scenarios 29 hit walls.
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree's branch
(`claude/strange-albattani-3fc83c`, HEAD at the latest A6.P1 commit).
Then paste:
---
```
Pick up A6.P1 capture work — scenarios 2 through 9. The infrastructure
shipped today (probe + cdb v4 + decoder all working). Scenario 1 captured
end-to-end with paired retail + acdream traces; preliminary findings
already strong (CP-write blowup confirms the M1.5 hypothesis).
Read FIRST:
docs/research/2026-05-21-a6-p1-partial-ship-handoff.md
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P1 — capture scenarios 2-9
Next concrete step: scenario 2 (Holtburg inn stairs)
Workflow per scenario (validated by scen1):
1. Verify retail binary matches PDB:
py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"
Expect MATCH (GUID {9e847e2f-...}).
2. User launches retail, walks character to scenario start, stops.
3. .\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."
Wait for "a6-probe v4 armed:" in the log file.
4. User performs the scripted walk.
5. Wait for cdb auto-detach (50K hits) OR kill cdb (acclient dies too;
relaunch needed). Hit rate ~6.5K/sec under motion.
6. User launches acdream with all 5 probe env vars + output to
docs/research/2026-05-21-a6-captures/scenN_.../acdream.log
7. User walks acdream through the SAME scripted walk.
8. Close acdream gracefully.
9. py tools/cdb/decode_retail_hex.py docs/research/.../retail.log
10. Commit retail.log + retail.decoded.log + acdream.log for that scenario.
Scenario list per the README at tools/cdb/README-a6-probe.md.
DO NOT modify the cdb script. v4 works (verified by BP6 threshold
decoding to FloorZ 0.6642 exactly). The hex-bits + Python decoder
pattern is the stable approach.
CLAUDE.md rules apply:
- Three failed visual verifications = handoff (we hit this on the
cdb script v1→v2→v3 cycle; v4 broke the streak).
- No workarounds without approval (v4 hex output isn't a workaround,
it's the chosen design after cdb %f proved unreliable).
- Visual verification at the Holtburg Sewer is the M1.5 physics
acceptance test (deferred to A6.P4 after fixes land).
After all 9 captures: proceed to A6.P2 analysis per the design spec
docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
§4. Note: Finding 2 (CP-write blowup) is already evidence-confirmed
from scen1; A6.P2 just needs to quantify + extend across scenarios.
```
---
## References
- Design spec: [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md)
- Implementation plan: [`docs/superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md`](../superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md)
- cdb script: [`tools/cdb/a6-probe.cdb`](../../tools/cdb/a6-probe.cdb) (v4)
- cdb runner: [`tools/cdb/a6-probe-runner.ps1`](../../tools/cdb/a6-probe-runner.ps1)
- Type dumper: [`tools/cdb/a6-types-dump.cdb`](../../tools/cdb/a6-types-dump.cdb) + [`a6-types-dump.txt`](../../tools/cdb/a6-types-dump.txt) (PDB-extracted offsets)
- Hex decoder: [`tools/cdb/decode_retail_hex.py`](../../tools/cdb/decode_retail_hex.py)
- Scen1 retail: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.log) + [`retail.decoded.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.decoded.log)
- Scen1 acdream: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log`](2026-05-21-a6-captures/scen1_inn_doorway/acdream.log)
- Findings doc stub (to be filled by A6.P2): [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](2026-05-21-a6-cdb-capture-findings.md)