Brainstormed + approved 2026-05-21 for M1.5 milestone work. Designs the cdb probe spike methodology (7 retail breakpoints + new [push-back] probe) to capture retail's per-tick BSP collision response state at 9 indoor scenarios (4 buildings + 5 dungeon sites) and compare against acdream. Working hypothesis: BSPQuery.AdjustSphereToPlane or its callers over-correct vs retail, producing the family of indoor symptoms (walls walk through, ping-pong, vibration, multi-Z falling) plus driving the existing #90 + TryFindIndoorWalkablePlane workarounds. A6 ships in 4 slices: P1 probe spike, P2 analysis, P3 surgical fixes, P4 workaround removal + acceptance. Phase O (DatPath Unification) pre-empted M1.5 and shipped 2026-05-21; A6 resumes from Phase O state. Phase O only touched rendering/dat code; indoor physics design is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase A6 — Indoor physics fidelity (cdb-driven) — design
Status: Active — brainstormed + approved 2026-05-21. Implementation starts at A6.P1 (cdb probe spike).
Milestone: M1.5 — "Indoor world feels right." Active per
docs/plans/2026-05-12-milestones.md.
Pre-empted by: Phase O — DatPath Unification (shipped 2256006,
2026-05-21). A6 resumes from M1.5 baseline + Phase O state. Phase O only
touched rendering / dat / scenery code; indoor physics code is unchanged.
Related:
- docs/plans/2026-04-11-roadmap.md: A6 entry under "Phases ahead".
- docs/research/2026-05-20-m15-kickoff-handoff.md: M1.5 baseline + workaround inventory.
- CLAUDE.md: "Retail debugger toolchain" section (cdb workflow).
1. Phase shape and inputs
1.1 Goal
Identify and fix the underlying BSP collision response divergence(s) that
produce the family of indoor symptoms (walking through walls, CellId
ping-pong, vibration, multi-Z falling, stairs walk-through) at the Holtburg
inn, cottages, and the Holtburg Sewer dungeon. Remove the two known
workarounds (#90 sphere-overlap stickiness in
PhysicsEngine.ResolveCellId; Transition.TryFindIndoorWalkablePlane
per-frame ContactPlane synthesis) as part of the same phase.
1.2 Hypothesis (single-cause)
Our BSPQuery.AdjustSphereToPlane, or one of its callers in the 6-path
dispatcher (BSPQuery.FindCollisions), pushes the sphere back farther than
retail does when resolving wall collisions. The over-correction pushes the
sphere center OUT of the cell on partial collisions, which causes (a)
ping-pong (masked by the #90 workaround), (b) ContactPlane invalidation
(masked by the TryFindIndoorWalkablePlane workaround), (c) vibration (#88:
multi-tick push-back oscillation), and (d) walk-through (large pushes
overshoot the wall). If true, one surgical fix in A6.P3 closes all four
symptoms and A6.P4 removes both workarounds.
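To make "over-correction" concrete, the sketch below is a generic minimum push-out of a sphere from a plane, written in Python purely for illustration. It is not a port of retail's adjust_sphere_to_plane or of BSPQuery.AdjustSphereToPlane (both also involve the movement vector and walk_interp); the function names and the plane convention N·p + d = 0 with unit-length N are assumptions. It only shows what a push-back delta is and what "more than necessary" would mean.

```python
import math

def signed_distance(normal, d, point):
    """Signed distance from point to the plane N.p + d = 0 (N must be unit length)."""
    return sum(n * p for n, p in zip(normal, point)) + d

def minimal_push_out(normal, d, center, radius):
    """Move the sphere center along N just far enough that the sphere rests on the plane.

    Generic textbook resolution, NOT retail's adjust_sphere_to_plane.
    Returns (new_center, push_distance).
    """
    dist = signed_distance(normal, d, center)
    if dist >= radius:
        return tuple(center), 0.0  # no penetration, nothing to correct
    push = radius - dist
    new_center = tuple(c + n * push for c, n in zip(center, normal))
    return new_center, push

def push_back_delta(before, after):
    """Euclidean length of the correction applied to the sphere center."""
    return math.dist(before, after)

# Hypothetical wall at x = 0 facing +x; a sphere of radius 0.5 has sunk 0.1 into it.
wall_n, wall_d = (1.0, 0.0, 0.0), 0.0
center = (0.4, 2.0, 1.0)
resolved, push = minimal_push_out(wall_n, wall_d, center, 0.5)
print(resolved, round(push, 6))
```

In this toy case the smallest correction that clears the wall is 0.1. Under the hypothesis above, an over-correcting implementation would report a larger delta for the same input, and the excess is what could carry the center out of the cell.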
1.3 Hypothesis (multi-cause backup)
The symptoms have distinct causes: one in the cell-list / find_cell_list
pipeline (the question retail's point-in-cell test raises for our
sphere-overlap workaround), one in the BSP correction paths, and one in
sub-step state mutation. A6.P1's wide-net capture surfaces all of them;
A6.P3 ships one PR per identified bug. A6 is structured to hold regardless
of which hypothesis the data validates: the probe spike methodology and
acceptance criteria don't change based on whether one or N fixes land.
1.4 Inputs
| Input | Source |
|---|---|
| Retail oracle (Sept 2013 EoR build, full PDB symbols) | docs/research/named-retail/acclient_2013_pseudo_c.txt |
| Matching retail binary (v11.4186) | C:\Turbine\Asheron's Call\acclient.exe, GUID-verified vs refs/acclient.pdb |
| cdb toolchain | C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe. Workflow documented in CLAUDE.md |
| Existing acdream probes | [indoor-bsp], [cell-transit], [cell-cache], [cp-write], [walk-miss] — all in PhysicsDiagnostics |
| New probe (to ship in A6.P1) | [push-back] — three sites in BSPQuery + Transition |
| Retail decomp anchors (read during brainstorm) | check_other_cells (272717), step_up (273099), transitional_insert Collide branch (273193), find_cell_list Position-variant (308742), sphere_intersects_cell (317666), adjust_sphere_to_plane (322032), find_collisions 6-path dispatcher (323725), find_walkable (326211), set_collide (321594) |
2. Slice structure
| Slice | Duration | Outputs |
|---|---|---|
| A6.P1: cdb probe spike + [push-back] build | ~3 days | New [push-back] probe shipped; cdb script committed at tools/cdb/a6-probe.cdb; 9 paired captures (retail+acdream) at docs/research/2026-05-21-a6-captures/scenN-<scenario>/{retail,acdream}.log; docs/research/2026-05-21-a6-cdb-capture-findings.md (the quantitative findings table). |
| A6.P2: Analysis report | ~1 day | Findings doc lists each bug candidate with: retail decomp anchor, our suspect code site, divergence quantified (e.g. "push-back at site X: retail mean=0.4 mm, ours mean=18 mm; 45× over"), proposed fix sketch. |
| A6.P3: Surgical fixes | ~3–5 days | One PR per identified bug. Each PR ships: the fix to the suspect code site, a unit test using golden values from A6.P1 captures as the regression anchor, visual verification at the scenario that surfaced the bug. |
| A6.P4: Workaround removal + acceptance | ~1 day | Pure-revert #90 commit (4ca3596). Delete Transition.TryFindIndoorWalkablePlane + its call site. Run all 9 scenarios as acceptance walks. Update CLAUDE.md baseline paragraph. Update milestones doc M1.5 partial-progress writeup. |
Total estimated phase duration: ~8–11 days focused work.
3. A6.P1 — cdb probe spike methodology
3.1 cdb breakpoint set (the wide-net script)
Seven breakpoints. All actions are non-blocking (gc after dump). Auto-detach
via qd at hit threshold to avoid manual cleanup. Per-breakpoint action
output stays under 200 bytes to mitigate retail-side lag (CLAUDE.md
gotcha — high BP hit rates trigger ACE timeout).
| # | Symbol | Captures | Hit-rate estimate |
|---|---|---|---|
| 1 | acclient!CTransition::transitional_insert | Sub-step loop entry: eax_2 (sub-step count), sphere_path.check_pos (target), sphere_path.curr_pos (current), sphere_path.insert_type | ~30 Hz × ~3 sub-steps/tick = ~100/sec |
| 2 | acclient!CTransition::step_up | Path 5 step-up entry: arg2 (step-up normal), sphere_path.walkable_allowance (cone angle) | Burst-only during stair ascent |
| 3 | acclient!SPHEREPATH::set_collide | Wall-collision halt: arg2 (collision normal), this->backup_check_pos | Burst during wall touches |
| 4 | acclient!BSPTREE::find_collisions | 6-path dispatcher: arg3 (walkable_allowance), eax->sphere_path.collide, state flag, insert_type. Dumps return value via gu;r eax to identify which path fired. | ~150/sec under motion |
| 5 | acclient!CPolygon::adjust_sphere_to_plane | The over-correction suspect. Input: arg3->center (pre-push-back), this->plane.N, this->plane.d, arg2->walk_interp. Output (after gu): arg3->center (post-push-back), arg2->walk_interp. Yields per-call push-back delta. | Burst during collisions |
| 6 | acclient!CTransition::validate_walkable | Ground-plane verdict: arg2, sphere_path.walkable, return value | ~30 Hz when grounded |
| 7 | acclient!CollisionInfo::set_contact_plane | CP writes: arg2 (plane), arg3 (water flag), one-frame backtrace for caller | ~30 Hz |
Auto-detach threshold: 50,000 total hits across all breakpoints OR
scenario-end marker (user runs over a specific spot — e.g. a torch). cdb
logs to scenario-tagged file via .logopen <path> at start.
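A rough budget check for the threshold, using only the sustained hit-rate estimates from the breakpoint table (breakpoints 1, 4, 6 and 7). The burst-only breakpoints (2, 3, 5) have no numeric estimate in this document, so they enter as a parameter; the function and constant names are illustrative, not part of the cdb script.

```python
# Sustained hit-rate estimates (hits/sec) copied from the breakpoint table.
SUSTAINED_HITS_PER_SEC = {
    "transitional_insert": 100,
    "find_collisions": 150,
    "validate_walkable": 30,
    "set_contact_plane": 30,
}
HIT_THRESHOLD = 50_000  # auto-detach threshold across all breakpoints

def total_rate(burst_hits_per_sec=0.0):
    """Combined hits/sec; burst breakpoints are an unknown supplied by the caller."""
    return sum(SUSTAINED_HITS_PER_SEC.values()) + burst_hits_per_sec

def seconds_until_detach(burst_hits_per_sec=0.0):
    """Time for the combined hit count to reach the auto-detach threshold."""
    return HIT_THRESHOLD / total_rate(burst_hits_per_sec)

def hits_in_capture(capture_sec, burst_hits_per_sec=0.0):
    """Expected hits accumulated over one capture pass."""
    return total_rate(burst_hits_per_sec) * capture_sec

print(round(seconds_until_detach()), hits_in_capture(30))
```

With no burst traffic the sustained estimate is about 310 hits/sec, so a ~30 sec capture pass produces roughly 9,300 hits. On these estimates the 50,000-hit threshold behaves as a safety net, and the scenario-end marker is the stop condition that would normally fire first.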
3.2 acdream-side mirror — the new [push-back] probe
AcDream.Core.Physics.PhysicsDiagnostics.ProbePushBackEnabled (env:
ACDREAM_PROBE_PUSH_BACK=1, DebugPanel mirror under "Diagnostics").
Fires at three sites:
| Site | Location | Per-call line fields |
|---|---|---|
| BSPQuery.AdjustSphereToPlane | BSPQuery.cs:332 | siteId=adjust_sphere, input center, plane.N, plane.D, radius, walkInterp (pre), dpPos, dpMove, iDist; output center (post), walkInterp (post), applied (true/false), cellId, polyId (from BSPQuery.LastBspHitPoly side-channel). Direct comparison to retail BP #5. |
| BSPQuery.FindCollisions entry | BSPQuery.cs:1550 + BSPQuery.cs:1895 | siteId=dispatch, state flags, pathTaken (1-7 mapped from return value), walkInterp (entry), collide, insertType, return state. Direct comparison to retail BP #4. |
| Transition.CheckOtherCells → ApplyOtherCellResult | TransitionTypes.cs:1614+ | siteId=other_cell, off-cell transition outcome, multi-cell BSP iteration result. Paired with A4's multi-cell BSP. |
All three respect existing logging conventions (timestamp prefix, off
when disabled, zero-cost when off — checked via if (!ProbePushBackEnabled) return; early-out at each site).
3.3 Capture pairing protocol per scenario
For each of the 9 scenarios:
- Setup phase (~30 sec):
  - User opens retail client, navigates to the scenario start point (e.g., outside Holtburg inn, facing the doorway). Stops.
  - cdb attaches via tools\cdb\a6-probe.cdb with scenario-tagged log (scen1_inn_doorway.log).
  - Separately, user launches acdream with ACDREAM_PROBE_PUSH_BACK=1 + ACDREAM_PROBE_INDOOR_BSP=1 + ACDREAM_PROBE_CELL=1 + ACDREAM_PROBE_CELL_CACHE=1 + ACDREAM_PROBE_CONTACT_PLANE=1.
  - Navigates +Acdream to the same scenario start. Stops.
- Capture phase (~30 sec per pass):
  - User performs the SAME scripted walk in BOTH clients (per-scenario scripts in the A6.P1 plan). E.g. "walk forward 2 meters, sidestep right 1 meter, walk forward 2 meters."
  - cdb fires breakpoints, logs to scenario file. acdream emits [push-back] lines to launch.log.
- Teardown phase:
  - cdb auto-detaches at hit threshold OR user triggers scenario-end marker.
  - Both logs filed to docs/research/2026-05-21-a6-captures/scenN-<scenario>/{retail,acdream}.log.
Total time estimate: ~5 min/scenario × 9 = 45 min user time at retail. Plus ~30 min capture-setup overhead = ~75 min single session. Captures can be split across days; each scenario's pair is self-contained.
3.4 The 9 scenarios
| # | Scenario | Location | Walk script |
|---|---|---|---|
| 1 | Inn doorway entry | Holtburg town inn front door | Walk forward through door, stop just inside |
| 2 | Inn stairs ascent | Holtburg inn interior, stairs to 2nd floor | Walk up 4 steps, stop on landing |
| 3 | Inn 2nd-floor walking | Holtburg inn 2nd floor | Walk forward 3 m, sidestep 1 m, walk back |
| 4 | Cottage cellar entry | Holtburg cottage with cellar | Walk to cellar opening, descend 2 steps |
| 5 | Sewer entry portal | Holtburg sewer entrance (in-town building stab) | Walk into portal, then walk 2 m forward inside |
| 6 | Sewer first stair descent | First stair after entry portal | Walk down full stair flight |
| 7 | Inter-room portal transition | Between any two sewer rooms via portal | Walk through portal, stop 1 m past |
| 8 | Open central chamber (multi-Z) | Sewer's multi-Z room | Walk in, traverse center, walk out other side |
| 9 | Dark corridor | Sewer narrow corridor | Walk full length end-to-end |
Order matters: 1→4 are buildings (smaller cells, simpler geometry), 5→9 are dungeons (larger cells, more portals, multi-Z). Capturing buildings first lets us verify the probe is producing usable data before committing to the dungeon traversal.
4. A6.P2 — analysis pipeline
Output: docs/research/2026-05-21-a6-cdb-capture-findings.md. Single
document, four mandatory tables plus a per-scenario narrative.
4.1 Table 1 — Per-site push-back delta (the smoking gun)
| Site | Scenario | Retail mean delta (mm) | Retail p99 (mm) | acdream mean (mm) | acdream p99 (mm) | Ratio |
|---|---|---|---|---|---|---|
Rows = (site × scenario) cross-product. Delta computed as
‖output_center − input_center‖ per call. If our ratio is > 3× retail
anywhere, that's the bug candidate. Surfaces over-correction in a
single column.
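One way the Table 1 columns could be computed, as a minimal sketch. It assumes the per-call input and output centers have already been extracted from both logs as 3-tuples in meters (the log line format and the unit are assumptions), uses a nearest-rank p99, and applies the 3× flag described above. The names table1_row, deltas_mm and p99 are hypothetical.

```python
import math
import statistics

def deltas_mm(calls):
    """Per-call push-back length in mm.

    calls: iterable of (input_center, output_center) 3-tuples, assumed to be in meters.
    """
    return [math.dist(before, after) * 1000.0 for before, after in calls]

def p99(values):
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]

def table1_row(site, scenario, retail_calls, acdream_calls, flag_ratio=3.0):
    """Build one (site x scenario) row; both call lists must be non-empty."""
    retail = deltas_mm(retail_calls)
    ours = deltas_mm(acdream_calls)
    retail_mean = statistics.fmean(retail)
    ours_mean = statistics.fmean(ours)
    if retail_mean > 0:
        ratio = ours_mean / retail_mean
    else:
        ratio = math.inf if ours_mean > 0 else 1.0
    return {
        "site": site, "scenario": scenario,
        "retail_mean_mm": retail_mean, "retail_p99_mm": p99(retail),
        "acdream_mean_mm": ours_mean, "acdream_p99_mm": p99(ours),
        "ratio": ratio, "bug_candidate": ratio > flag_ratio,
    }

# Made-up inputs that reproduce the illustrative 0.4 mm vs 18 mm figures.
retail_calls = [((0.0, 0.0, 0.0), (0.0004, 0.0, 0.0))] * 3
acdream_calls = [((0.0, 0.0, 0.0), (0.018, 0.0, 0.0))] * 3
row = table1_row("adjust_sphere", "scen1", retail_calls, acdream_calls)
print(round(row["ratio"], 3), row["bug_candidate"])
```

Comparing means alone can hide a small number of very large pushes, which is why the table carries p99 next to the mean.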
4.2 Table 2 — Path-frequency diff
| Scenario | Path | Retail count | acdream count | Diff % |
|---|---|---|---|---|
Paths labeled 1–7 per the find_collisions dispatcher (PLACEMENT_INSERT,
check_walkable, step_down, collide_with_pt, set_collide+slid,
step_sphere_up, find_walkable). Surfaces divergent path selection (e.g.
"we fire Path 5 step_up where retail fires Path 6 set_collide").
4.3 Table 3 — ContactPlane lifecycle diff
| Scenario | Retail CP writes/sec | acdream CP writes/sec | Retail CP-restore-from-LKCP/sec | acdream CP-restore/sec |
|---|---|---|---|---|
Surfaces the per-frame CP-resynthesis pattern that
TryFindIndoorWalkablePlane is masking.
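A small sketch of how the Table 3 rates might be derived, assuming each log yields per-event timestamps in seconds for CP writes and CP restores (the timestamp source and the dict layout are assumptions, and the sample numbers are made up).

```python
def events_per_sec(timestamps, window_start, window_end):
    """Rate of events whose timestamp falls in [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    hits = sum(1 for t in timestamps if window_start <= t < window_end)
    return hits / (window_end - window_start)

def table3_row(scenario, retail, acdream, window_start, window_end):
    """retail / acdream: dicts with 'write' and 'restore' timestamp lists (seconds)."""
    def rate(ts):
        return events_per_sec(ts, window_start, window_end)
    return {
        "scenario": scenario,
        "retail_cp_writes_per_sec": rate(retail["write"]),
        "acdream_cp_writes_per_sec": rate(acdream["write"]),
        "retail_cp_restores_per_sec": rate(retail["restore"]),
        "acdream_cp_restores_per_sec": rate(acdream["restore"]),
    }

# Made-up data: one side writes the CP occasionally, the other on every frame.
retail = {"write": [0.0, 1.0], "restore": [0.5]}
acdream = {"write": [i * 0.1 for i in range(20)], "restore": []}
row3 = table3_row("scen1", retail, acdream, 0.0, 2.0)
print(row3)
```

Using the same window for both clients matters here; the walk scripts are identical, but the two captures are not guaranteed to be the same length.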
4.4 Table 4 — Sub-step state mutations
| Scenario | Field | Retail mutations/sec | acdream mutations/sec |
|---|---|---|---|
Fields = cell_array_valid, hits_interior_cell, walk_interp,
walkable, collide. Surfaces stale-state across sub-steps (vibration
/ #88 family).
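A possible way to count mutations for Table 4, as a sketch. It assumes a "mutation" means the field's value differs between two consecutive sub-step snapshots, and that each snapshot is available as a dict keyed by the field names listed above; both are assumptions about how the captures get post-processed.

```python
FIELDS = ("cell_array_valid", "hits_interior_cell", "walk_interp", "walkable", "collide")

def mutations_per_sec(snapshots, duration_sec):
    """Count value changes between consecutive snapshots, normalised by capture length.

    snapshots: list of dicts, one per sub-step, in capture order.
    A field missing from both neighbours compares equal and is not counted.
    """
    counts = {field: 0 for field in FIELDS}
    for prev, cur in zip(snapshots, snapshots[1:]):
        for field in FIELDS:
            if prev.get(field) != cur.get(field):
                counts[field] += 1
    return {field: n / duration_sec for field, n in counts.items()}

# Made-up trace: collide flips twice, walk_interp changes once, over 0.5 sec.
trace = [
    {"collide": False, "walk_interp": 1.0},
    {"collide": True, "walk_interp": 1.0},
    {"collide": False, "walk_interp": 0.5},
    {"collide": False, "walk_interp": 0.5},
]
rates = mutations_per_sec(trace, 0.5)
print(rates["collide"], rates["walk_interp"])
```

A field that mutates far more often in acdream than in retail for the same walk is a hint that state is being reset or left stale between sub-steps.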
4.5 Per-scenario narrative
For each scenario, one paragraph describing what the trace shows frame-by-frame. Include a side-by-side trace excerpt at the sub-step where the divergence is sharpest.
4.6 Findings section
Numbered bug candidates. Each entry contains:
- Title
- Retail decomp anchor (line in acclient_2013_pseudo_c.txt)
- Our suspect code site (file + line)
- Divergence quantified (e.g. "push-back at site X: retail mean=0.4 mm, ours mean=18 mm; 45× over")
- Proposed fix sketch (1-3 paragraphs)
- Scenarios affected (which of the 9 reproduce this bug)
4.7 Acceptance for A6.P2
Every M1.5-in-scope symptom (#83, #88, #90, stairs walk-through,
2nd-floor walking, cellar descent, TryFindIndoorWalkablePlane MISS)
maps to at least one bug candidate, OR is explicitly flagged as
"not surfaced by this capture — defer to A8 / promote scope".
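The coverage rule above can be checked mechanically. The sketch below assumes each bug candidate is recorded with a list of the symptoms it explains and that deferred symptoms are kept in a separate list; that data layout and the function name are hypothetical.

```python
# Symptom labels taken from the acceptance rule above.
SYMPTOMS = [
    "#83", "#88", "#90", "stairs walk-through",
    "2nd-floor walking", "cellar descent", "TryFindIndoorWalkablePlane MISS",
]

def unmapped_symptoms(candidates, deferred):
    """Return in-scope symptoms that no candidate covers and nobody deferred.

    candidates: list of dicts, each with a 'symptoms' list.
    deferred: symptoms explicitly flagged as not surfaced by the capture.
    """
    covered = set(deferred)
    for candidate in candidates:
        covered.update(candidate["symptoms"])
    return [s for s in SYMPTOMS if s not in covered]

# Made-up findings: one candidate explains four symptoms, one symptom is deferred.
candidates = [
    {"title": "bug-1", "symptoms": ["#88", "#90", "stairs walk-through", "2nd-floor walking"]},
]
deferred = ["#83"]
missing = unmapped_symptoms(candidates, deferred)
print(missing)
```

A6.P2 would be acceptable under this rule only when the returned list is empty.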
5. A6.P3 — fix surface
5.1 Sequencing
One PR per bug. Each PR ships independently with its own commit message + visual verification gate. Reasoning: per-bug attribution is clearer; bisecting future regressions is easier; rolling back one PR doesn't take all fixes down.
Order: highest-confidence single-cause fix first (probably
AdjustSphereToPlane if Table 1 confirms over-correction). Re-run the
9-scenario probe spike AFTER each PR to verify (a) the targeted bug is
closed, (b) no other symptom is worse. If the re-run shows multiple
symptoms closed by the same fix, that's evidence for the single-cause
hypothesis — file the other planned PRs as N/A and proceed to A6.P4.
5.2 Per-PR shape
Each PR ships:
- The fix to the suspect code site (surgical change, narrow scope).
- A unit test using golden values captured during A6.P1 as the regression anchor. Example: "AdjustSphereToPlane at plane N=(0,0,1) d=−100, sphere center=(50,50,99.5), movement=(0,0,−0.5), radius=0.6 → output center.Z = 99.4 ± 0.1 mm matching retail capture line 47".
- Visual verification at the scenario that surfaced the bug.
5.3 Likely-touched files
Based on hypothesis + decomp reading. The actual fix surface is whatever A6.P2 surfaces.
| File | Likely touch |
|---|---|
| src/AcDream.Core/Physics/BSPQuery.cs | AdjustSphereToPlane, 6-path dispatcher entry, Path 5 (step_up) / Path 6 (set_collide) branches |
| src/AcDream.Core/Physics/TransitionTypes.cs | Transition.FindEnvCollisions indoor branch, ApplyOtherCellResult state reset |
| src/AcDream.Core/Physics/CellTransit.cs | FindCellList if path-selection diverges from retail's point-in-cell |
| src/AcDream.Core/Physics/PhysicsEngine.cs | ResolveCellId only via A6.P4 removal of the #90 workaround; no functional change in A6.P3 |
5.4 Commit message convention
fix(physics): A6.P3 — <bug candidate N> — <one-line summary>. Body
references the A6.P2 findings doc by anchor (§ findings.bug-1 etc.).
6. A6.P4 — workaround removal + acceptance
6.1 Workaround 1 — Issue #90 sphere-overlap stickiness
Location: src/AcDream.Core/Physics/PhysicsEngine.cs:285-300.
Removal: pure revert of 4ca3596. The BSPQuery.SphereIntersectsCellBsp
helper itself stays — it's also used by #89's CheckBuildingTransit
which IS retail-faithful per the sphere_intersects_cell decomp at
acclient_2013_pseudo_c.txt:317666.
Verification: with A6.P3's fix(es) landed, walk Holtburg inn doorway
— observe NO [cell-transit] ping-pong, no walk-through.
Tripwire: add a regression test at
tests/AcDream.Core.Tests/Physics/PhysicsEngineResolveCellIdTests.cs
that constructs a sphere at the inn-doorway geometry post-push-back, calls
ResolveCellId, and asserts the indoor CellId is preserved. Golden values
are captured from A6.P1 scenario #1.
6.2 Workaround 2 — TryFindIndoorWalkablePlane synthesis
Location: src/AcDream.Core/Physics/TransitionTypes.cs:1294-1373
(method body) + caller at :1519.
Removal: delete method body + caller block. The three CP-retention
mechanisms — A (Path 6 land write at acclient_2013_pseudo_c.txt:323924),
B (validate_transition LKCP proximity restore at :272565), C (post-OK
step-down probe at :273242) — must catch the player without the
synthesis.
Verification: walk Holtburg cottage doorway threshold (the case that broke Bug A's revert on 2026-05-20). Walk Holtburg sewer first stair descent. With A6.P3's fix(es) landed, observe NO free-fall through doorway threshold, NO falling-stuck on stair descent.
Tripwire: existing walk-miss probe stays enabled. MISS rate must
drop to <5% (current: 99.87% per
docs/research/2026-05-21-walk-miss-capture-findings.md).
6.3 M1.5-physics acceptance criteria
All five must hold for A6 to be marked complete:
- All 9 scenarios walk cleanly with NO probe warnings.
- walk-miss MISS rate < 5% across a 60-sec wander.
- [cell-transit] log shows zero ping-pong events in the 60-sec wander.
- Holtburg Sewer end-to-end walk (entry → 5–7 rooms → exit) without getting stuck, without walking through any wall, without falling through any stair. This is the M1.5 physics acceptance criterion.
- M0 + M1 outdoor regression: walk Holtburg outdoor for 60 sec, observe no regressions in outdoor cell handling, no FPS drop, baseline test suite still 1147 + 8 (or whatever post-Phase O baseline is — to be re-measured at A6.P1 start).
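The two numeric criteria (MISS rate and ping-pong count) lend themselves to an automated gate. The sketch below is one possible shape, under stated assumptions: a ping-pong is taken to be a CellId sequence A → B → A completed within a short time window (the 0.5 sec default is a guess, not a value from this design), and transitions are available as (timestamp, cell_id) pairs parsed from the [cell-transit] log. The remaining criteria are visual and stay manual.

```python
def count_ping_pong(transitions, max_gap_sec=0.5):
    """Count A -> B -> A cell flips that complete within max_gap_sec.

    transitions: list of (timestamp_sec, cell_id), one entry per CellId change.
    The time window is an assumed heuristic to avoid counting a deliberate
    walk into a room and back out.
    """
    count = 0
    for (t0, a), (_, b), (t2, c) in zip(transitions, transitions[1:], transitions[2:]):
        if a == c and a != b and (t2 - t0) <= max_gap_sec:
            count += 1
    return count

def miss_rate(miss_calls, total_calls):
    """Fraction of walk-miss probe calls that reported MISS."""
    return miss_calls / total_calls if total_calls else 0.0

def physics_gate(miss_calls, total_calls, transitions, max_miss_rate=0.05):
    """True when both numeric acceptance criteria hold."""
    return (miss_rate(miss_calls, total_calls) < max_miss_rate
            and count_ping_pong(transitions) == 0)

# Made-up cell ids; the first trace flips back within 0.15 sec, the second does not.
flapping = [(0.0, 1), (0.1, 2), (0.15, 1)]
clean = [(0.0, 1), (5.0, 2), (12.0, 1)]
print(count_ping_pong(flapping), count_ping_pong(clean))
print(physics_gate(9987, 10000, clean), physics_gate(30, 10000, clean))
```

The 9987-of-10000 input mirrors the 99.87% MISS rate cited earlier and fails the gate, as expected before any fix lands.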
6.4 Three-failed-verifications policy
Per CLAUDE.md Visual verification rule: if three
consecutive visual verifications fail at the same scenario after
attempted fixes, stop and write a handoff doc. Do not attempt a
fourth fix on the same symptom in the same session. Hand off with full
reproduction notes + probe captures + the failed-fix code diffs.
7. Out of scope
- A7 (Indoor lighting fidelity): separate phase, separate methodology (RenderDoc + retail-decomp), follows A6. Per CLAUDE.md, do NOT mix lighting work into A6. If A6.P1 reveals a lighting cause for an apparent physics symptom, file as an A7 issue and continue A6.
- A2: polygon_hits_sphere_slow_but_sure tangent-epsilon rejection. Separate issue; A6.P4's walk-miss MISS rate target tolerates the residual A2-class events as long as they're < 5% of all calls.
- Outdoor physics regression: A6 is indoor-focused. Outdoor walks appear in acceptance criteria only as a regression backstop, not as a fix surface. Any outdoor-physics findings in A6.P1 capture get filed as separate issues for post-A6 work.
- Combat / animation / movement networking: completely orthogonal. M2's prerequisites.
8. References
8.1 Retail decomp anchors
All in docs/research/named-retail/acclient_2013_pseudo_c.txt:
- :272717-272798: CTransition::check_other_cells (A4 oracle, already ported)
- :273099-273133: CTransition::step_up
- :273193-273239: CTransition::transitional_insert Collide branch
- :308742-308819: CObjCell::find_cell_list Position-variant (the hysteresis question for #90's root cause)
- :317657-317671: CCellStruct::point_in_cell + sphere_intersects_cell
- :321594-321607: SPHEREPATH::set_collide
- :322032-322077: CPolygon::adjust_sphere_to_plane (suspected over-correction site)
- :322974-322993: CPolygon::pos_hits_sphere (front-face culling)
- :323725-323939: BSPTREE::find_collisions (full 6-path dispatcher)
- :326211-326242: BSPNODE::find_walkable
8.2 Our suspect code sites
- src/AcDream.Core/Physics/BSPQuery.cs:332: AdjustSphereToPlane
- src/AcDream.Core/Physics/BSPQuery.cs:1550, :1895: FindCollisions (two overloads)
- src/AcDream.Core/Physics/PhysicsEngine.cs:285-300: ResolveCellId #90 workaround block
- src/AcDream.Core/Physics/TransitionTypes.cs:1294-1373: TryFindIndoorWalkablePlane synthesis (workaround)
- src/AcDream.Core/Physics/PhysicsDiagnostics.cs: probe infrastructure (where [push-back] lives)
8.3 Prior handoffs
- docs/research/2026-05-20-m15-kickoff-handoff.md: M1.5 baseline + workaround inventory.
- docs/research/2026-05-20-phase-a4-shipped-cell-pingpong-finding.md: A4 ship + #90 ping-pong investigation.
- docs/research/2026-05-21-walk-miss-capture-findings.md: TryFindIndoorWalkablePlane MISS rate evidence (99.87%).
- docs/research/2026-05-20-indoor-walking-bug-a-handoff.md: Bug A's tried-and-reverted synthesis removal story.