docs(handoff): A6.P1 partial-ship — infra DONE, captures 1/9

Pickup-prompt + lessons doc for the A6.P1 capture work. Documents:

- The 16 commits shipping today (infrastructure Tasks 1-14 + cdb
  script v1→v4 iteration + scen1 capture + decoder).
- WHY cdb iterated 4 times: v1 wrong offsets, v2 PowerShell UTF-16,
  v3 cdb %f unreliable with dwo()/@@c++, v4 hex output works.
- Scen1 findings already strong: dispatcher entry frequency mismatch
  (acdream 20× fewer than retail) + ContactPlane write blowup
  (~100-1000× more frequent in acdream) — directly confirms the
  spec's M1.5 hypothesis about per-frame CP resynthesis.
- Per-scenario protocol validated by scen1.
- Pasteable session-start prompt for picking up scenarios 2-9.
- Known issues (kill-cdb-kills-retail, .printf %f unreliable, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-05-21 20:05:17 +02:00
parent 194ed3ef21
commit 2f2b63f8bd


@ -0,0 +1,344 @@
# A6.P1 partial-ship handoff — 2026-05-21
**Status:** Infrastructure complete + scenario 1 (Holtburg inn doorway)
captured end-to-end (retail + acdream paired). Scenarios 2-9 deferred to
next session.
**Pasteable session-start prompt at the bottom of this doc.**
## TL;DR
A6.P1 ships in two milestones:
1. **Infrastructure milestone (DONE today):** `[push-back]` acdream probe (3
helpers + 3 sites + DebugVM mirror + CLAUDE.md docs), cdb probe script
(v4 with PDB-verified offsets + hex-bits float output), PowerShell
runner with ASCII encoding, README, capture-dir scaffolding,
PDB-match verification, type dumper, hex→float decoder.
2. **Capture milestone (PARTIAL):** 1 of 9 scenarios captured. Scenarios
2-9 are user-driven and were deferred at user direction to avoid fatigue.
**Scenario 1 already surfaces two strong M1.5 findings** (before any
formal A6.P2 analysis):
| Metric | Retail | acdream | Notes |
|---|---:|---:|---|
| dispatcher entries (find_collisions / BSPQuery.FindCollisions) | 5,818 | 295 | acdream calls dispatcher **20× less often** |
| ContactPlane writes (set_contact_plane fn / per-field writes) | 18 calls | **73,304** field-writes | acdream **rewrites CP every frame/sub-step** vs retail's per-event |
The CP-write blowup directly confirms the spec's hypothesis
([2026-05-21-phase-a6-indoor-physics-fidelity-design.md §1.2](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md))
that `FindEnvCollisions` indoor branch resynthesizes CP per frame
instead of retaining via Mechanisms A/B/C. Same family as the
`TryFindIndoorWalkablePlane` workaround.
## State both altitudes (next session)
> **Currently working toward: M1.5 — "Indoor world feels right."**
>
> **Current phase: A6 — Indoor physics fidelity (cdb-driven).**
>
> **Next concrete step: Capture scenarios 2-9 (paired retail + acdream
> traces). Then run A6.P2 analysis on all 9 captures.**
## What shipped today (16 commits)
### Infrastructure (Tasks 1-14 from the A6.P1 plan)
| Commit | What |
|---|---|
| `ace9e62`, `ad6c89d` | T1: `ProbePushBackEnabled` toggle + roundtrip test |
| `3a173b9` | T2: `LogPushBackAdjust` helper |
| `eb8a318` | T3: instrument `BSPQuery.AdjustSphereToPlane` |
| `2d1f27d` | T4: `LogPushBackDispatch` helper |
| `35631d1` | T5: instrument `BSPQuery.FindCollisions` |
| `66ee757` | T6: `LogPushBackCellTransit` helper |
| `642734d` | T7: instrument `Transition.CheckOtherCells` |
| `dd95c10` | T8: DebugVM `ProbePushBack` mirror |
| `e1f7efe` | T9: CLAUDE.md `ACDREAM_PROBE_PUSH_BACK` env var docs |
| `7bb799b` | T10: `tools/cdb/a6-probe.cdb` v1 (broken offsets) |
| `1c640eb` | T11: `tools/cdb/a6-probe-runner.ps1` (later patched to ASCII) |
| `df315a9` | T12: `tools/cdb/README-a6-probe.md` |
| `0e21f22`, `22e341f` | T13: PDB-match verification (audit trail) |
| `260c60f` | T14: capture-dir scaffolding + findings doc stub |
### cdb script iteration (T15 dry-runs)
| Commit | What |
|---|---|
| `d0c8c54` | v1→v2 prep: type dumper (`a6-types-dump.cdb` + runner) + ASCII runner |
| `7b9b26f` | v2 cdb script: PDB-verified offsets + BP6 fix to `check_walkable` |
| `1b6d49e` | v3 cdb script: `@@c++(*(float*)addr)` for floats (still produced zeros) |
| `2d841cb` | v4 cdb script: hex-bits float output via `%08X` (WORKS) |
### Scen1 capture + decode tooling
| Commit | What |
|---|---|
| `180b4a5` | scen1 retail.log captured (v4 cdb, 13,552 hits, real hex bits) |
| `8ca718a` | scen1 acdream.log paired (84,130 lines, full probe distribution) |
| `194ed3e` | `decode_retail_hex.py` — Python hex→float decoder + scen1 decoded log |
## Why cdb v1→v4 iteration was necessary
The cdb side hit three landmines we didn't anticipate when writing the
A6.P1 plan:
1. **v1: Stack-arg offsets wrong.** Plan's probe actions used arbitrary
registers (`@edx`, `@edi`) to read function args. `__thiscall` puts
non-this args on the stack (`[esp+N]`), not in arbitrary registers.
All 12 BP5 hits printed `Nx=0 Ny=0 ...` — confirming the read
addresses were wrong. **Fix:** type dumper + double-indirect via
`dwo(poi(@esp+N)+offset)`.
2. **v2: BP6 symbol wrong + PowerShell UTF-16 encoding.** v1's
`validate_walkable` doesn't exist in the PDB (the actual function is
`CTransition::check_walkable`). PowerShell's `Tee-Object` writes
UTF-16 LE by default, making logs ungreppable. **Fixes:** BP6 symbol
corrected, runner switched to `Out-File -Encoding ASCII`. v2 had
correct integer reads (substeps=3, insertType=0) but all `%f` floats
still printed as 0.000000.
3. **v3: `%f` doesn't work with `dwo()`.** Switching to
`@@c++(*(float*)addr)` to force C++ interpretation also produced
0.000000 across all float fields. cdb's `.printf %f` appears to not
reliably handle our float values (possibly varargs promotion, possibly
a deeper limitation). **Workaround (v4):** print all floats as 32-bit
hex bits via `%08X`; Python decoder reinterprets via
`struct.unpack('<f', struct.pack('<I', value))`.
The v4 + decoder pattern works. **Pickup sessions should NOT change
the cdb script** unless adding new BPs. The hex-bits encoding is robust
and the decoder validates against known constants (BP6 threshold = FloorZ).
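
The hex-bits round trip described above can be sketched as follows. This is a minimal illustration, not the contents of `decode_retail_hex.py`: the `NAME=XXXXXXXX` field layout and the helper names are assumptions made for the example, and the sketch treats every 8-digit hex field as a float (integer fields such as `substeps` would need to be excluded in a real decoder).

```python
import re
import struct


def hex_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as a little-endian IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


# Hypothetical field layout: NAME=XXXXXXXX with exactly 8 hex digits.
_FIELD = re.compile(r"(\w+)=([0-9A-Fa-f]{8})\b")


def decode_line(line: str) -> str:
    """Replace every NAME=<8 hex digits> field with NAME=<float, 4 decimals>."""
    def _sub(match: re.Match) -> str:
        value = hex_bits_to_float(int(match.group(2), 16))
        return f"{match.group(1)}={value:.4f}"

    return _FIELD.sub(_sub, line)


if __name__ == "__main__":
    # 0x3F800000 is 1.0 and 0xBF400000 is -0.75 in IEEE-754 single precision.
    print(decode_line("BP5 Nz=3F800000 Mz=BF400000"))
```

Printing raw bits sidesteps cdb's float formatting entirely, so the only interpretation step happens in Python, where it is easy to check against known constants.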
## Scen1 findings (preliminary — formal A6.P2 to follow)
### Capture pair
- Retail: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log` (raw v4 hex) + `retail.decoded.log` (decoded floats).
- Acdream: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log` (84,130 lines).
### BP hit-count distribution (2-sec walk through inn doorway, both clients)
| Site | Retail | Acdream | Ratio (acdream/retail) |
|---|---:|---:|---:|
| transitional_insert / sub-step loop | 7,686 (BP1) | n/a (no acdream probe) | — |
| find_collisions dispatch | 5,818 (BP4) | 295 ([push-back-disp]) | **0.05× (20× fewer)** |
| adjust_sphere_to_plane | 12 (BP5) | 8 ([push-back]) | 0.67× |
| check_other_cells loop | n/a (BP3 zero — no wall hit) | 5 ([push-back-cell]) | — |
| check_walkable / ground verdict | 12 (BP6) | n/a (no acdream probe) | — |
| set_contact_plane / CP writes | 18 (BP7 fn calls) | 73,304 (per-field) | **~100-1000× more** |
| step_up | 1 (BP2) | n/a | — |
| set_collide / wall halt | 0 (no wall hit in scen1) | n/a | — |
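
The acdream column of a table like this can be reproduced by tallying probe tags in `acdream.log`. The sketch below assumes each probe line contains its tag in square brackets (`[push-back]`, `[push-back-disp]`, `[push-back-cell]`); the rest of the line format and the function name are hypothetical.

```python
import re
from collections import Counter

# Matches [push-back], [push-back-disp], [push-back-cell] and similar suffixes.
TAG = re.compile(r"\[(push-back(?:-[a-z]+)?)\]")


def tally_tags(lines):
    """Count how many lines carry each push-back probe tag."""
    counts = Counter()
    for line in lines:
        match = TAG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative lines only; the real log format may differ.
    sample = [
        "[push-back-disp] cell=0x0100 substep=1",
        "[push-back] n=(0,0,1) d=0",
        "[push-back-disp] cell=0x0100 substep=2",
        "unrelated line",
    ]
    for tag, count in sorted(tally_tags(sample).items()):
        print(f"{tag}: {count}")
```

To run it against a capture, pass an open file handle (for example `open(path, encoding="ascii", errors="replace")`) instead of the sample list.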
### Finding 1: dispatcher entry frequency mismatch
Retail's `BSPTREE::find_collisions` fires 5,818 times in ~2 seconds of
walking (~2,900/sec). Acdream's `BSPQuery.FindCollisions` fires only
295 times in the same scenario (~150/sec).
**Possible causes** (investigate during A6.P2):
- Physics tick rate difference (retail 30Hz? per CLAUDE.md
steep-roof finding) vs acdream's tick.
- Different sub-step cadence inside `transitional_insert`:
  retail's outer loop iterates much more than ours.
- Different number of cells visited per sub-step (retail's CELLARRAY
iteration calls dispatcher per cell; we may only call once
per primary cell).
- Probe scope difference: retail's BP catches `BSPTREE::find_collisions`
(one C++ class). Acdream's `[push-back-disp]` covers
`BSPQuery.FindCollisions` modern overload (one C# method). If our
call paths into dispatcher are differently structured, frequencies
diverge.
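
The rates quoted in this finding follow from the hit counts and the approximate 2-second walk. A small sketch of that arithmetic, assuming both captures cover the same 2.0-second window (the duration is approximate in the source):

```python
def per_second(hits: int, seconds: float) -> float:
    """Average hit rate over the capture window."""
    return hits / seconds


WALK_SECONDS = 2.0       # approximate walk duration from the capture notes
RETAIL_HITS = 5818       # BP4, BSPTREE::find_collisions
ACDREAM_HITS = 295       # [push-back-disp], BSPQuery.FindCollisions

retail_rate = per_second(RETAIL_HITS, WALK_SECONDS)
acdream_rate = per_second(ACDREAM_HITS, WALK_SECONDS)
ratio = retail_rate / acdream_rate

if __name__ == "__main__":
    print(f"retail:  {retail_rate:.0f} hits/sec")
    print(f"acdream: {acdream_rate:.1f} hits/sec")
    print(f"retail / acdream: {ratio:.1f}x")
```

Because the window cancels out of the ratio, the roughly 20× gap does not depend on the exact walk duration, only on both captures covering the same walk.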
### Finding 2: ContactPlane write blowup
Acdream writes 73,304 ContactPlane field-level updates in 30 seconds
(~2,400/sec including the boot phase before the player moved).
Retail's `set_contact_plane` fires 18 times (~6/sec including boot).
Even assuming each `set_contact_plane` call writes 6 fields, retail's
18 calls amount to ~100 field writes against acdream's 73,304
(equivalently, 18 calls against ~12,000 call-equivalents in acdream):
**100×+ more frequent in acdream**.
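
To keep the units straight (retail counts function calls, acdream counts per-field writes), the comparison can be normalized either way. The sketch below takes the 6-fields-per-call multiplier from the text as an assumption rather than a measured value.

```python
RETAIL_CALLS = 18                # BP7 set_contact_plane function calls
ACDREAM_FIELD_WRITES = 73_304    # per-field ContactPlane writes
FIELDS_PER_CALL = 6              # assumed multiplier, not measured

# Normalize retail up to field writes...
retail_field_writes = RETAIL_CALLS * FIELDS_PER_CALL
# ...or normalize acdream down to call-equivalents.
acdream_call_equivalents = ACDREAM_FIELD_WRITES / FIELDS_PER_CALL

# Both normalizations give the same ratio.
ratio = ACDREAM_FIELD_WRITES / retail_field_writes

if __name__ == "__main__":
    print(f"retail field writes (assumed): {retail_field_writes}")
    print(f"acdream call-equivalents:      {acdream_call_equivalents:.0f}")
    print(f"acdream / retail:              {ratio:.0f}x")
```

Under this assumption the gap is several hundred times, comfortably above the 100× floor stated in the finding; a smaller multiplier would only widen it.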
**This is the M1.5 hypothesis confirmed empirically.** Per the spec
§1.2, the working hypothesis was that `FindEnvCollisions` indoor
branch rewrites CP every frame instead of retaining it via the three
documented retention mechanisms. The 73K CP-write count confirms it.
A6.P3 fix surface: stop rewriting CP every frame; use the existing
LKCP-restore (Mechanism B at `validate_transition`) + Path-6 land
write (Mechanism A) + post-OK step-down probe (Mechanism C).
`TryFindIndoorWalkablePlane` synthesis (the workaround flagged for
A6.P4 removal) is part of the same bad-pattern family.
### Per-call shape match (BP5 hit#1)
| Field | Retail (decoded) | Acdream | Match? |
|---|---|---|---|
| Plane.N | (0, 0, 1) | (0, 0, 1) | ✓ identical |
| Plane.d | -0.0000 | -0.0000 | ✓ identical |
| Sphere.center.x | 0.0046 | -0.4325 | independent walks |
| Sphere.center.y | 10.3072 | 11.0219 | independent walks |
| Sphere.center.z | -0.2700 | 0.4600 | DIFFERENT axis — investigate |
| Sphere.radius | 0.4800 | 0.4800 | ✓ identical |
| WalkInterp (pre) | 1.0000 | 1.0000 | ✓ identical |
| Movement.x | 0.0000 | 0.0000 | ✓ identical |
| Movement.y | -0.0000 | -0.0000 | ✓ identical |
| Movement.z | -0.7500 | -0.5000 | DIFFERENT — investigate |
The shape matches (vertical step-down probe against ground), but two
axis values differ between retail and acdream:
- **Sphere.center.z**: retail -0.27, acdream +0.46. Could be different
local-space conventions (retail's localspace_pos vs acdream's
per-cell transform).
- **Movement.z**: retail -0.75 (the value passed by the call site
in retail's decomp), acdream -0.50 (smaller step-down probe distance).
These could be the BSP correction-path divergence the spec hypothesizes,
or they could be benign convention differences. A6.P2 with the full 9
scenarios will surface which.
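
For A6.P2, the same field-by-field comparison could be automated across hits. A minimal sketch using the BP5 hit#1 values from the table above; the dictionary keys and the tolerance are illustrative choices, not part of the existing tooling.

```python
import math

# BP5 hit#1 values copied from the table (retail decoded vs acdream).
retail = {
    "Plane.N.z": 1.0, "Plane.d": -0.0,
    "Sphere.center.x": 0.0046, "Sphere.center.y": 10.3072,
    "Sphere.center.z": -0.27, "Sphere.radius": 0.48,
    "WalkInterp": 1.0, "Movement.z": -0.75,
}
acdream = {
    "Plane.N.z": 1.0, "Plane.d": -0.0,
    "Sphere.center.x": -0.4325, "Sphere.center.y": 11.0219,
    "Sphere.center.z": 0.46, "Sphere.radius": 0.48,
    "WalkInterp": 1.0, "Movement.z": -0.5,
}


def diff_fields(a: dict, b: dict, tol: float = 1e-4) -> list:
    """Return the field names whose values differ by more than tol."""
    return [k for k in a if not math.isclose(a[k], b[k], abs_tol=tol)]


if __name__ == "__main__":
    for name in diff_fields(retail, acdream):
        print(f"{name}: retail={retail[name]} acdream={acdream[name]}")
```

The x/y centers will always differ between independent walks, so a real comparison would probably whitelist those and flag only fields expected to match, such as `Sphere.center.z` and `Movement.z`.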
## What's deferred (scenarios 2-9 + A6.P2)
### Scenarios 2-9 (~40 min user time at ~5 min each)
| # | Tag | Location | Walk script |
|---|---|---|---|
| 2 | scen2_inn_stairs | Holtburg inn, stairs to 2nd floor | Walk up 4 steps, stop on landing |
| 3 | scen3_inn_2nd_floor | Holtburg inn 2nd floor | Walk forward 3 m, sidestep 1 m, walk back |
| 4 | scen4_cottage_cellar | Holtburg cottage with cellar | Walk to cellar opening, descend 2 steps |
| 5 | scen5_sewer_entry | Holtburg sewer entrance | Walk into portal, then walk 2 m forward inside |
| 6 | scen6_sewer_first_stair | Sewer's first stair after entry | Walk down full stair flight |
| 7 | scen7_sewer_inter_room | Between any two sewer rooms via portal | Walk through portal, stop 1 m past |
| 8 | scen8_sewer_chamber | Sewer's multi-Z room | Walk in, traverse center, walk out other side |
| 9 | scen9_sewer_corridor | Sewer narrow corridor | Walk full length end-to-end |
Per-scenario protocol (validated by scen1):
1. User launches retail, navigates character to start point, stops.
2. Run `.\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."`.
Wait for `a6-probe v4 armed:` confirmation in
`docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
3. User performs the scripted walk in retail.
4. cdb auto-detaches at 50K hits (or kill cdb to end the capture early;
acclient comes down too, so accept that and relaunch).
5. User launches acdream with all 5 probe env vars
(`ACDREAM_PROBE_PUSH_BACK=1` + indoor_bsp + cell + cell_cache + contact_plane).
Output to `docs/research/2026-05-21-a6-captures/scenN_.../acdream.log`.
6. User walks acdream through the SAME scripted walk.
7. Close acdream gracefully.
8. Run `py tools/cdb/decode_retail_hex.py docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
9. Commit `retail.log`, `acdream.log`, `retail.decoded.log` for that scenario.
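
Step 2 of the protocol waits for the `a6-probe v4 armed:` line before the walk starts. That wait could be scripted instead of eyeballed; the sketch below is a hypothetical helper (not part of the existing runner) that polls the ASCII log until the marker appears or a timeout expires.

```python
import time
from pathlib import Path


def wait_for_marker(log_path, marker="a6-probe v4 armed:",
                    timeout_s=60.0, poll_s=0.5) -> bool:
    """Poll log_path until marker appears; return False on timeout."""
    path = Path(log_path)
    deadline = time.monotonic() + timeout_s
    while True:
        # The runner writes ASCII; replace any stray bytes rather than fail.
        if path.exists() and marker in path.read_text(encoding="ascii",
                                                      errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    import sys

    # Usage: python wait_armed.py <path to retail.log>
    armed = wait_for_marker(sys.argv[1])
    print("armed" if armed else "timed out waiting for armed marker")
```

Re-reading the whole file on each poll is fine here because the marker is printed near the top of the log, before the hit volume grows.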
### A6.P2 (analysis report) — ~1 day after all 9 scenarios are in
Spec §4 of the design doc defines the 4 mandatory tables:
1. Per-site push-back delta (Table 1)
2. Path-frequency diff (Table 2)
3. ContactPlane lifecycle diff (Table 3)
4. Sub-step state mutations (Table 4)
Plus per-scenario narrative + findings section.
**Already have strong evidence for Finding 2 (CP-write blowup)** from
scen1 alone. A6.P2 quantifies + extends across the remaining 8
scenarios + writes the formal A6.P3 fix sketches.
## Known issues + gotchas (lessons from today)
1. **Killing cdb kills retail** (per CLAUDE.md). Either wait for 50K
threshold via `qd` auto-detach (~60 sec under motion) or accept that
killing cdb takes acclient down too. Relaunch is ~30 sec.
2. **PowerShell `Tee-Object` writes UTF-16 LE.** The runner uses
`Out-File -Encoding ASCII` to fix this. Don't revert.
3. **cdb `.printf %f` is unreliable.** v4 uses hex output + Python
decoder. Do NOT try to "simplify" back to `%f`.
4. **Retail binary must match the PDB** (GUID `{9e847e2f-...}`,
linker UTC `2013-09-06`). Verify with
`py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"`
before any capture session.
5. **Hit-rate budget under motion.** ~13K total hits per 2-sec walk.
Threshold of 50K survives ~8 sec of continuous walking before
auto-detach. For longer scenarios (sewer corridor end-to-end),
the walk may need to be broken into multiple captures OR threshold
bumped to 100K (edit `a6-probe.cdb`: `.if (@$t0 >= 50000)` → `100000`).
6. **BP6 fires with FloorZ (0.6642) not cos85 (0.0872).** v4 confirmed
this — `check_walkable` is called with `PhysicsGlobals.FloorZ` for
ground verdicts. The cos85 value (0.0872) is passed in a different
code path (post-set_collide wall-slide) which didn't fire during
scen1 (no wall hits). Will appear when scenarios 2-9 hit walls.
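
The hit-rate budget in item 5 is a simple division, and it is worth re-running before long scenarios. A sketch, assuming the hit rate under motion stays near the scen1 figure of ~13K hits per 2-second walk:

```python
def seconds_until_detach(threshold_hits: int, hits_per_second: float) -> float:
    """Continuous walking time before the hit counter reaches the threshold."""
    return threshold_hits / hits_per_second


# ~13K hits over a ~2-second walk in scen1; other scenarios may differ.
HITS_PER_SECOND = 13_000 / 2.0

if __name__ == "__main__":
    for threshold in (50_000, 100_000):
        budget = seconds_until_detach(threshold, HITS_PER_SECOND)
        print(f"threshold {threshold}: ~{budget:.1f} sec of walking")
```

If a scripted walk is expected to take longer than the computed budget, split it into multiple captures or raise the threshold before arming the probe.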
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree's branch
(`claude/strange-albattani-3fc83c`, HEAD at the latest A6.P1 commit).
Then paste:
---
```
Pick up A6.P1 capture work — scenarios 2 through 9. The infrastructure
shipped today (probe + cdb v4 + decoder all working). Scenario 1 captured
end-to-end with paired retail + acdream traces; preliminary findings
already strong (CP-write blowup confirms the M1.5 hypothesis).
Read FIRST:
docs/research/2026-05-21-a6-p1-partial-ship-handoff.md
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P1 — capture scenarios 2-9
Next concrete step: scenario 2 (Holtburg inn stairs)
Workflow per scenario (validated by scen1):
1. Verify retail binary matches PDB:
py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"
Expect MATCH (GUID {9e847e2f-...}).
2. User launches retail, walks character to scenario start, stops.
3. .\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."
Wait for "a6-probe v4 armed:" in the log file.
4. User performs the scripted walk.
5. Wait for cdb auto-detach (50K hits) OR kill cdb (acclient dies too;
relaunch needed). Hit rate ~6.5K/sec under motion.
6. User launches acdream with all 5 probe env vars + output to
docs/research/2026-05-21-a6-captures/scenN_.../acdream.log
7. User walks acdream through the SAME scripted walk.
8. Close acdream gracefully.
9. py tools/cdb/decode_retail_hex.py docs/research/.../retail.log
10. Commit retail.log + retail.decoded.log + acdream.log for that scenario.
Scenario list per the README at tools/cdb/README-a6-probe.md.
DO NOT modify the cdb script. v4 works (verified by BP6 threshold
decoding to FloorZ 0.6642 exactly). The hex-bits + Python decoder
pattern is the stable approach.
CLAUDE.md rules apply:
- Three failed visual verifications = handoff (we hit this on the
cdb script v1→v2→v3 cycle; v4 broke the streak).
- No workarounds without approval (v4 hex output isn't a workaround,
it's the chosen design after cdb %f proved unreliable).
- Visual verification at the Holtburg Sewer is the M1.5 physics
acceptance test (deferred to A6.P4 after fixes land).
After all 9 captures: proceed to A6.P2 analysis per the design spec
docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
§4. Note: Finding 2 (CP-write blowup) is already evidence-confirmed
from scen1; A6.P2 just needs to quantify + extend across scenarios.
```
---
## References
- Design spec: [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md)
- Implementation plan: [`docs/superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md`](../superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md)
- cdb script: [`tools/cdb/a6-probe.cdb`](../../tools/cdb/a6-probe.cdb) (v4)
- cdb runner: [`tools/cdb/a6-probe-runner.ps1`](../../tools/cdb/a6-probe-runner.ps1)
- Type dumper: [`tools/cdb/a6-types-dump.cdb`](../../tools/cdb/a6-types-dump.cdb) + [`a6-types-dump.txt`](../../tools/cdb/a6-types-dump.txt) (PDB-extracted offsets)
- Hex decoder: [`tools/cdb/decode_retail_hex.py`](../../tools/cdb/decode_retail_hex.py)
- Scen1 retail: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.log) + [`retail.decoded.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.decoded.log)
- Scen1 acdream: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log`](2026-05-21-a6-captures/scen1_inn_doorway/acdream.log)
- Findings doc stub (to be filled by A6.P2): [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](2026-05-21-a6-cdb-capture-findings.md)