Erik 2f2b63f8bd docs(handoff): A6.P1 partial-ship — infra DONE, captures 1/9

Pickup-prompt + lessons doc for the A6.P1 capture work. Documents:

- The 16 commits shipping today (infrastructure Tasks 1-14 + cdb
  script v1→v4 iteration + scen1 capture + decoder).
- WHY cdb iterated 4 times: v1 wrong offsets, v2 PowerShell UTF-16,
  v3 cdb %f unreliable with dwo()/@@c++, v4 hex output works.
- Scen1 findings already strong: dispatcher entry frequency mismatch
  (acdream 20× fewer than retail) + ContactPlane write blowup
  (~100-1000× more frequent in acdream) — directly confirms the
  spec's M1.5 hypothesis about per-frame CP resynthesis.
- Per-scenario protocol validated by scen1.
- Pasteable session-start prompt for picking up scenarios 2-9.
- Known issues (kill-cdb-kills-retail, .printf %f unreliable, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-21 20:05:17 +02:00

17 KiB

Raw Blame History

A6.P1 partial-ship handoff — 2026-05-21

Status: Infrastructure complete + scenario 1 (Holtburg inn doorway) captured end-to-end (retail + acdream paired). Scenarios 2–9 deferred to next session.

Pasteable session-start prompt at the bottom of this doc.

TL;DR

A6.P1 ships in two milestones:

Infrastructure milestone (DONE today): [push-back] acdream probe (3 helpers + 3 sites + DebugVM mirror + CLAUDE.md docs), cdb probe script (v4 with PDB-verified offsets + hex-bits float output), PowerShell runner with ASCII encoding, README, capture-dir scaffolding, PDB-match verification, type dumper, hex→float decoder.
Capture milestone (PARTIAL): 1 of 9 scenarios captured. Scenarios 2–9 user-driven, deferred at user direction to avoid fatigue.

Scenario 1 already surfaces two strong M1.5 findings (before any formal A6.P2 analysis):

Metric	Retail	acdream	Notes
dispatcher entries (find_collisions / BSPQuery.FindCollisions)	5,818	295	acdream calls dispatcher 20× less often
ContactPlane writes (set_contact_plane fn / per-field writes)	18 calls	73,304 field-writes	acdream rewrites CP every frame/sub-step vs retail's per-event

The CP-write blowup directly confirms the spec's hypothesis (2026-05-21-phase-a6-indoor-physics-fidelity-design.md §1.2) that FindEnvCollisions indoor branch resynthesizes CP per frame instead of retaining via Mechanisms A/B/C. Same family as the TryFindIndoorWalkablePlane workaround.

State both altitudes (next session)

Currently working toward: M1.5 — "Indoor world feels right."

Current phase: A6 — Indoor physics fidelity (cdb-driven).

Next concrete step: Capture scenarios 2–9 (paired retail + acdream traces). Then run A6.P2 analysis on all 9 captures.

What shipped today (16 commits)

Infrastructure (Tasks 1–14 from the A6.P1 plan)

Commit	What
`ace9e62`, `ad6c89d`	T1: `ProbePushBackEnabled` toggle + roundtrip test
`3a173b9`	T2: `LogPushBackAdjust` helper
`eb8a318`	T3: instrument `BSPQuery.AdjustSphereToPlane`
`2d1f27d`	T4: `LogPushBackDispatch` helper
`35631d1`	T5: instrument `BSPQuery.FindCollisions`
`66ee757`	T6: `LogPushBackCellTransit` helper
`642734d`	T7: instrument `Transition.CheckOtherCells`
`dd95c10`	T8: DebugVM `ProbePushBack` mirror
`e1f7efe`	T9: CLAUDE.md `ACDREAM_PROBE_PUSH_BACK` env var docs
`7bb799b`	T10: `tools/cdb/a6-probe.cdb` v1 (broken offsets)
`1c640eb`	T11: `tools/cdb/a6-probe-runner.ps1` (later patched to ASCII)
`df315a9`	T12: `tools/cdb/README-a6-probe.md`
`0e21f22`, `22e341f`	T13: PDB-match verification (audit trail)
`260c60f`	T14: capture-dir scaffolding + findings doc stub

cdb script iteration (T15 dry-runs)

Commit	What
`d0c8c54`	v1→v2 prep: type dumper (`a6-types-dump.cdb` + runner) + ASCII runner
`7b9b26f`	v2 cdb script: PDB-verified offsets + BP6 fix to `check_walkable`
`1b6d49e`	v3 cdb script: `@@c++((float)addr)` for floats (still produced zeros)
`2d841cb`	v4 cdb script: hex-bits float output via `%08X` (WORKS)

Scen1 capture + decode tooling

Commit	What
`180b4a5`	scen1 retail.log captured (v4 cdb, 13,552 hits, real hex bits)
`8ca718a`	scen1 acdream.log paired (84,130 lines, full probe distribution)
`194ed3e`	`decode_retail_hex.py` — Python hex→float decoder + scen1 decoded log

Why cdb v1→v4 iteration was necessary

The cdb side hit three landmines we didn't anticipate when writing the A6.P1 plan:

v1: Stack-arg offsets wrong. Plan's probe actions used arbitrary registers (@edx, @edi) to read function args. __thiscall puts non-this args on the stack ([esp+N]), not in arbitrary registers. All 12 BP5 hits printed Nx=0 Ny=0 ... — confirming the read addresses were wrong. Fix: type dumper + double-indirect via dwo(poi(@esp+N)+offset).
v2: BP6 symbol wrong + PowerShell UTF-16 encoding. v1's validate_walkable doesn't exist in the PDB (the actual function is CTransition::check_walkable). PowerShell's Tee-Object writes UTF-16 LE by default, making logs ungreppable. Fixes: BP6 symbol corrected, runner switched to Out-File -Encoding ASCII. v2 had correct integer reads (substeps=3, insertType=0) but all %f floats still printed as 0.000000.
v3: %f doesn't work with dwo(). Switching to @@c++(*(float*)addr) to force C++ interpretation also produced 0.000000 across all float fields. cdb's .printf %f appears to not reliably handle our float values (possibly varargs promotion, possibly a deeper limitation). Workaround (v4): print all floats as 32-bit hex bits via %08X; Python decoder reinterprets via struct.unpack('<f', struct.pack('<I', value)).

The v4 + decoder pattern works. Pickup sessions should NOT change the cdb script unless adding new BPs. The hex-bits encoding is robust and the decoder validates against known constants (BP6 threshold = FloorZ).

Scen1 findings (preliminary — formal A6.P2 to follow)

Capture pair

Retail: docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log (raw v4 hex) + retail.decoded.log (decoded floats).
Acdream: docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log (84,130 lines).

BP hit-count distribution (2-sec walk through inn doorway, both clients)

Site	Retail	Acdream	Ratio (acdream/retail)
transitional_insert / sub-step loop	7,686 (BP1)	n/a (no acdream probe)	—
find_collisions dispatch	5,818 (BP4)	295 ([push-back-disp])	0.05× (20× fewer)
adjust_sphere_to_plane	12 (BP5)	8 ([push-back])	0.67×
check_other_cells loop	n/a (BP3 zero — no wall hit)	5 ([push-back-cell])	—
check_walkable / ground verdict	12 (BP6)	n/a (no acdream probe)	—
set_contact_plane / CP writes	18 (BP7 fn calls)	73,304 (per-field)	~100–1000× more
step_up	1 (BP2)	n/a	—
set_collide / wall halt	0 (no wall hit in scen1)	n/a	—

Finding 1: dispatcher entry frequency mismatch

Retail's BSPTREE::find_collisions fires 5,818 times in ~2 seconds of walking (~2,900/sec). Acdream's BSPQuery.FindCollisions fires only 295 times in the same scenario (~150/sec).

Possible causes (investigate during A6.P2):

Physics tick rate difference (retail 30Hz? per CLAUDE.md steep-roof finding) vs acdream's tick.
Different sub-step cadence inside transitional_insert — retail's outer loop iterates much more than ours.
Different number of cells visited per sub-step (retail's CELLARRAY iteration calls dispatcher per cell; we may only call once per primary cell).
Probe scope difference: retail's BP catches BSPTREE::find_collisions (one C++ class). Acdream's [push-back-disp] covers BSPQuery.FindCollisions modern overload (one C# method). If our call paths into dispatcher are differently structured, frequencies diverge.

Finding 2: ContactPlane write blowup

Acdream writes 73,304 ContactPlane field-level updates in 30 seconds (~2,400/sec including the boot phase before the player moved). Retail's set_contact_plane fires 18 times (~6/sec including boot). Even with a 6× field-write multiplier per set_contact_plane call, that gives ~100 actual CP updates in retail vs ~12,000 in acdream — 100×+ more frequent in acdream.

This is the M1.5 hypothesis confirmed empirically. Per the spec §1.2, the working hypothesis was that FindEnvCollisions indoor branch rewrites CP every frame instead of retaining it via the three documented retention mechanisms. The 73K cp-write data confirms.

A6.P3 fix surface: stop rewriting CP every frame; use the existing LKCP-restore (Mechanism B at validate_transition) + Path-6 land write (Mechanism A) + post-OK step-down probe (Mechanism C). TryFindIndoorWalkablePlane synthesis (the workaround flagged for A6.P4 removal) is part of the same bad-pattern family.

Per-call shape match (BP5 hit#1)

Field	Retail (decoded)	Acdream	Match?
Plane.N	(0, 0, 1)	(0, 0, 1)	✓ identical
Plane.d	-0.0000	-0.0000	✓ identical
Sphere.center.x	0.0046	-0.4325	independent walks
Sphere.center.y	10.3072	11.0219	independent walks
Sphere.center.z	-0.2700	0.4600	DIFFERENT axis — investigate
Sphere.radius	0.4800	0.4800	✓ identical
WalkInterp (pre)	1.0000	1.0000	✓ identical
Movement.x	0.0000	0.0000	✓ identical
Movement.y	-0.0000	-0.0000	✓ identical
Movement.z	-0.7500	-0.5000	DIFFERENT — investigate

The shape matches (vertical step-down probe against ground), but two axis values differ between retail and acdream:

Sphere.center.z: retail -0.27, acdream +0.46. Could be different local-space conventions (retail's localspace_pos vs acdream's per-cell transform).
Movement.z: retail -0.75 (the value passed by the call site in retail's decomp), acdream -0.50 (smaller step-down probe distance).

These could be the BSP correction-path divergence the spec hypothesizes, or they could be benign convention differences. A6.P2 with the full 9 scenarios will surface which.

What's deferred (scenarios 2–9 + A6.P2)

Scenarios 2–9 (~40 min user time at ~5 min each)

#	Tag	Location	Walk script
2	scen2_inn_stairs	Holtburg inn, stairs to 2nd floor	Walk up 4 steps, stop on landing
3	scen3_inn_2nd_floor	Holtburg inn 2nd floor	Walk forward 3 m, sidestep 1 m, walk back
4	scen4_cottage_cellar	Holtburg cottage with cellar	Walk to cellar opening, descend 2 steps
5	scen5_sewer_entry	Holtburg sewer entrance	Walk into portal, then walk 2 m forward inside
6	scen6_sewer_first_stair	Sewer's first stair after entry	Walk down full stair flight
7	scen7_sewer_inter_room	Between any two sewer rooms via portal	Walk through portal, stop 1 m past
8	scen8_sewer_chamber	Sewer's multi-Z room	Walk in, traverse center, walk out other side
9	scen9_sewer_corridor	Sewer narrow corridor	Walk full length end-to-end

Per-scenario protocol (validated by scen1):

User launches retail, navigates character to start point, stops.
Run .\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_...". Wait for a6-probe v4 armed: confirmation in docs/research/2026-05-21-a6-captures/scenN_.../retail.log.
User performs the scripted walk in retail.
cdb auto-detaches at 50K hits (or kill cdb to release retail — acclient comes down too, accept and relaunch).
User launches acdream with all 5 probe env vars (ACDREAM_PROBE_PUSH_BACK=1 + indoor_bsp + cell + cell_cache + contact_plane). Output to docs/research/2026-05-21-a6-captures/scenN_.../acdream.log.
User walks acdream through the SAME scripted walk.
Close acdream gracefully.
Run py tools/cdb/decode_retail_hex.py docs/research/2026-05-21-a6-captures/scenN_.../retail.log.
Commit retail.log, acdream.log, retail.decoded.log for that scenario.

A6.P2 (analysis report) — ~1 day after all 9 scenarios are in

Spec §4 of the design doc defines the 4 mandatory tables:

Per-site push-back delta (Table 1)
Path-frequency diff (Table 2)
ContactPlane lifecycle diff (Table 3)
Sub-step state mutations (Table 4)

Plus per-scenario narrative + findings section.

Already have strong evidence for Finding 2 (CP-write blowup) from scen1 alone. A6.P2 quantifies + extends across the remaining 8 scenarios + writes the formal A6.P3 fix sketches.

Known issues + gotchas (lessons from today)

Killing cdb kills retail (per CLAUDE.md). Either wait for 50K threshold via qd auto-detach (~60 sec under motion) or accept that killing cdb takes acclient down too. Relaunch is ~30 sec.
PowerShell Tee-Object writes UTF-16 LE. The runner uses Out-File -Encoding ASCII to fix this. Don't revert.
cdb .printf %f is unreliable. v4 uses hex output + Python decoder. Do NOT try to "simplify" back to %f.
Retail binary must match the PDB (GUID {9e847e2f-...}, linker UTC 2013-09-06). Verify with py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe" before any capture session.
Hit-rate budget under motion. ~13K total hits per 2-sec walk. Threshold of 50K survives ~8 sec of continuous walking before auto-detach. For longer scenarios (sewer corridor end-to-end), the walk may need to be broken into multiple captures OR threshold bumped to 100K (edit a6-probe.cdb .if (@$t0 >= 50000) → 100000).
BP6 fires with FloorZ (0.6642) not cos85 (0.0872). v4 confirmed this — check_walkable is called with PhysicsGlobals.FloorZ for ground verdicts. The cos85 value (0.0872) is passed in a different code path (post-set_collide wall-slide) which didn't fire during scen1 (no wall hits). Will appear when scenarios 2–9 hit walls.

Pickup prompt for fresh session

Open a new Claude Code session at this worktree's branch (claude/strange-albattani-3fc83c, HEAD at the latest A6.P1 commit). Then paste:

Pick up A6.P1 capture work — scenarios 2 through 9. The infrastructure
shipped today (probe + cdb v4 + decoder all working). Scenario 1 captured
end-to-end with paired retail + acdream traces; preliminary findings
already strong (CP-write blowup confirms the M1.5 hypothesis).

Read FIRST:
  docs/research/2026-05-21-a6-p1-partial-ship-handoff.md
Then state both altitudes:
  Currently working toward: M1.5 — Indoor world feels right
  Current phase: A6.P1 — capture scenarios 2-9
  Next concrete step: scenario 2 (Holtburg inn stairs)

Workflow per scenario (validated by scen1):
  1. Verify retail binary matches PDB:
     py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"
     Expect MATCH (GUID {9e847e2f-...}).
  2. User launches retail, walks character to scenario start, stops.
  3. .\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."
     Wait for "a6-probe v4 armed:" in the log file.
  4. User performs the scripted walk.
  5. Wait for cdb auto-detach (50K hits) OR kill cdb (acclient dies too;
     relaunch needed). Hit rate ~6.5K/sec under motion.
  6. User launches acdream with all 5 probe env vars + output to
     docs/research/2026-05-21-a6-captures/scenN_.../acdream.log
  7. User walks acdream through the SAME scripted walk.
  8. Close acdream gracefully.
  9. py tools/cdb/decode_retail_hex.py docs/research/.../retail.log
  10. Commit retail.log + retail.decoded.log + acdream.log for that scenario.

Scenario list per the README at tools/cdb/README-a6-probe.md.

DO NOT modify the cdb script. v4 works (verified by BP6 threshold
decoding to FloorZ 0.6642 exactly). The hex-bits + Python decoder
pattern is the stable approach.

CLAUDE.md rules apply:
  - Three failed visual verifications = handoff (we hit this on the
    cdb script v1→v2→v3 cycle; v4 broke the streak).
  - No workarounds without approval (v4 hex output isn't a workaround,
    it's the chosen design after cdb %f proved unreliable).
  - Visual verification at the Holtburg Sewer is the M1.5 physics
    acceptance test (deferred to A6.P4 after fixes land).

After all 9 captures: proceed to A6.P2 analysis per the design spec
docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
§4. Note: Finding 2 (CP-write blowup) is already evidence-confirmed
from scen1; A6.P2 just needs to quantify + extend across scenarios.

References

Design spec: docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
Implementation plan: docs/superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md
cdb script: tools/cdb/a6-probe.cdb (v4)
cdb runner: tools/cdb/a6-probe-runner.ps1
Type dumper: tools/cdb/a6-types-dump.cdb + a6-types-dump.txt (PDB-extracted offsets)
Hex decoder: tools/cdb/decode_retail_hex.py
Scen1 retail: docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log + retail.decoded.log
Scen1 acdream: docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log
Findings doc stub (to be filled by A6.P2): docs/research/2026-05-21-a6-cdb-capture-findings.md

17 KiB Raw Blame History Unescape Escape