acdream/docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
Erik f9214433c3 docs(spec): Phase A6 — Indoor physics fidelity (cdb-driven) — design
Brainstormed + approved 2026-05-21 for M1.5 milestone work. Designs
the cdb probe spike methodology (7 retail breakpoints + new
[push-back] probe) to capture retail's per-tick BSP collision
response state at 9 indoor scenarios (4 buildings + 5 dungeon sites)
and compare against acdream. Working hypothesis: BSPQuery.AdjustSphereToPlane
or its callers over-correct vs retail, producing the family of
indoor symptoms (walls walk through, ping-pong, vibration, multi-Z
falling) plus driving the existing #90 + TryFindIndoorWalkablePlane
workarounds. A6 ships in 4 slices: P1 probe spike, P2 analysis,
P3 surgical fixes, P4 workaround removal + acceptance.

Phase O (DatPath Unification) pre-empted M1.5 and shipped 2026-05-21;
A6 resumes from Phase O state. Phase O only touched rendering/dat
code; indoor physics design is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 17:57:48 +02:00


Phase A6 — Indoor physics fidelity (cdb-driven) — design

Status: Active — brainstormed + approved 2026-05-21. Implementation starts at A6.P1 (cdb probe spike).

Milestone: M1.5 — "Indoor world feels right." Active per docs/plans/2026-05-12-milestones.md.

Pre-empted by: Phase O — DatPath Unification (shipped 2256006, 2026-05-21). A6 resumes from M1.5 baseline + Phase O state. Phase O only touched rendering / dat / scenery code; indoor physics code is unchanged.

Related:


1. Phase shape and inputs

1.1 Goal

Identify and fix the underlying BSP collision response divergence(s) that produce the family of indoor symptoms (wall walk-through, CellId ping-pong, vibration, multi-Z falling, stairs walk-through) at Holtburg inn, cottages, and the Holtburg Sewer dungeon. Remove the two known workarounds (#90 sphere-overlap stickiness in PhysicsEngine.ResolveCellId; Transition.TryFindIndoorWalkablePlane per-frame ContactPlane synthesis) as part of the same phase.

1.2 Hypothesis (single-cause)

Our BSPQuery.AdjustSphereToPlane, or one of its callers in the 6-path dispatcher (BSPQuery.FindCollisions), pushes the sphere back further than retail does when resolving wall collisions. The over-correction pushes the sphere center OUT of the cell on partial collisions, which causes (a) ping-pong [masked by the #90 workaround], (b) ContactPlane invalidation [masked by the TryFindIndoorWalkablePlane workaround], (c) vibration [#88: multi-tick push-back oscillation], and (d) walk-through [large pushes overshoot the wall]. If true, one surgical fix in A6.P3 closes all four symptoms and A6.P4 removes both workarounds.

1.3 Hypothesis (multi-cause backup)

The symptoms have distinct causes: one in the cell-list / find_cell_list pipeline (the question retail's point-in-cell raises for our sphere-overlap workaround), one in the BSP correction paths, one in sub-step state mutation. A6.P1's wide-net capture surfaces all of them; A6.P3 ships one PR per identified bug. A6 is structured to hold regardless of which hypothesis the data validates: the probe spike methodology and acceptance criteria don't change based on whether one or N fixes land.

1.4 Inputs

| Input | Source |
| --- | --- |
| Retail oracle (Sept 2013 EoR build, full PDB symbols) | docs/research/named-retail/acclient_2013_pseudo_c.txt |
| Matching retail binary (v11.4186) | C:\Turbine\Asheron's Call\acclient.exe, GUID-verified vs refs/acclient.pdb |
| cdb toolchain | C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe. Workflow documented in CLAUDE.md |
| Existing acdream probes | [indoor-bsp], [cell-transit], [cell-cache], [cp-write], [walk-miss]; all in PhysicsDiagnostics |
| New probe (to ship in A6.P1) | [push-back]; three sites in BSPQuery + Transition |
| Retail decomp anchors (read during brainstorm) | check_other_cells (272717), step_up (273099), transitional_insert Collide branch (273193), find_cell_list Position-variant (308742), sphere_intersects_cell (317666), adjust_sphere_to_plane (322032), find_collisions 6-path dispatcher (323725), find_walkable (326211), set_collide (321594) |

2. Slice structure

| Slice | Duration | Outputs |
| --- | --- | --- |
| A6.P1: cdb probe spike + [push-back] build | ~3 days | New [push-back] probe shipped; cdb script committed at tools/cdb/a6-probe.cdb; 9 paired captures (retail+acdream) at docs/research/2026-05-21-a6-captures/scenN-<scenario>/{retail,acdream}.log; docs/research/2026-05-21-a6-cdb-capture-findings.md (the quantitative findings table). |
| A6.P2: Analysis report | ~1 day | Findings doc lists each bug candidate with: retail decomp anchor, our suspect code site, divergence quantified (e.g. "push-back at site X: retail mean=0.4 mm, ours mean=18 mm; 45× over"), proposed fix sketch. |
| A6.P3: Surgical fixes | ~3-5 days | One PR per identified bug. Each PR ships: the fix to the suspect code site, a unit test using golden values from A6.P1 captures as the regression anchor, visual verification at the scenario that surfaced the bug. |
| A6.P4: Workaround removal + acceptance | ~1 day | Pure-revert #90 commit (4ca3596). Delete Transition.TryFindIndoorWalkablePlane + its call site. Run all 9 scenarios as acceptance walks. Update CLAUDE.md baseline paragraph. Update milestones doc M1.5 partial-progress writeup. |

Total estimated phase duration: ~8-11 days focused work.


3. A6.P1 — cdb probe spike methodology

3.1 cdb breakpoint set (the wide-net script)

Seven breakpoints. All actions are non-blocking (gc after dump). Auto-detach via qd at hit threshold to avoid manual cleanup. Per-breakpoint action output stays under 200 bytes to mitigate retail-side lag (CLAUDE.md gotcha — high BP hit rates trigger ACE timeout).

| # | Symbol | Captures | Hit-rate estimate |
| --- | --- | --- | --- |
| 1 | acclient!CTransition::transitional_insert | Sub-step loop entry: eax_2 (sub-step count), sphere_path.check_pos (target), sphere_path.curr_pos (current), sphere_path.insert_type | ~30 Hz × ~3 sub-steps/tick = ~100/sec |
| 2 | acclient!CTransition::step_up | Path 5 step-up entry: arg2 (step-up normal), sphere_path.walkable_allowance (cone angle) | Burst-only during stair ascent |
| 3 | acclient!SPHEREPATH::set_collide | Wall-collision halt: arg2 (collision normal), this->backup_check_pos | Burst during wall touches |
| 4 | acclient!BSPTREE::find_collisions | 6-path dispatcher: arg3 (walkable_allowance), eax->sphere_path.collide, state flag, insert_type. Dumps return value via gu;r eax to identify which path fired. | ~150/sec under motion |
| 5 | acclient!CPolygon::adjust_sphere_to_plane | The over-correction suspect. Input: arg3->center (pre-push-back), this->plane.N, this->plane.d, arg2->walk_interp. Output (after gu): arg3->center (post-push-back), arg2->walk_interp. Yields per-call push-back delta. | Burst during collisions |
| 6 | acclient!CTransition::validate_walkable | Ground-plane verdict: arg2, sphere_path.walkable, return value | ~30 Hz when grounded |
| 7 | acclient!CollisionInfo::set_contact_plane | CP writes: arg2 (plane), arg3 (water flag), one-frame backtrace for caller | ~30 Hz |

Auto-detach threshold: 50,000 total hits across all breakpoints OR scenario-end marker (user runs over a specific spot — e.g. a torch). cdb logs to scenario-tagged file via .logopen <path> at start.
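
As a rough sanity check on the 50,000-hit threshold, the steady-state hit-rate estimates from the breakpoint table can be summed to see how long a capture runs before auto-detach. This is a minimal sketch: the dictionary keys and function names are hypothetical, and the burst-only breakpoints (#2, #3, #5) are left out because the table gives no rate for them, so the real budget is somewhat tighter.

```python
# Steady-state hit-rate estimates (hits/sec) copied from the breakpoint table.
# Burst-only breakpoints are omitted (assumption: no rate is given for them).
STEADY_RATES_PER_SEC = {
    "transitional_insert": 100,
    "find_collisions": 150,
    "validate_walkable": 30,
    "set_contact_plane": 30,
}


def seconds_until_detach(rates, threshold=50_000):
    """Seconds of continuous motion before the total hit threshold is reached."""
    return threshold / sum(rates.values())


def hits_in_pass(rates, pass_seconds):
    """Expected total hits for one capture pass of the given length."""
    return sum(rates.values()) * pass_seconds


if __name__ == "__main__":
    print(f"detach after ~{seconds_until_detach(STEADY_RATES_PER_SEC):.0f} s")
    print(f"hits in a 30 s pass: {hits_in_pass(STEADY_RATES_PER_SEC, 30)}")
```

Under these estimates a 30-second pass stays well below the threshold, so the scenario-end marker would normally end the capture first.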

3.2 acdream-side mirror — the new [push-back] probe

AcDream.Core.Physics.PhysicsDiagnostics.ProbePushBackEnabled (env: ACDREAM_PROBE_PUSH_BACK=1, DebugPanel mirror under "Diagnostics").

Fires at three sites:

| Site | Location | Per-call line fields |
| --- | --- | --- |
| BSPQuery.AdjustSphereToPlane | BSPQuery.cs:332 | siteId=adjust_sphere, input center, plane.N, plane.D, radius, walkInterp (pre), dpPos, dpMove, iDist; output center (post), walkInterp (post), applied (true/false), cellId, polyId (from BSPQuery.LastBspHitPoly side-channel). Direct comparison to retail BP #5. |
| BSPQuery.FindCollisions entry | BSPQuery.cs:1550 + BSPQuery.cs:1895 | siteId=dispatch, state flags, pathTaken (1-7 mapped from return value), walkInterp (entry), collide, insertType, return state. Direct comparison to retail BP #4. |
| Transition.CheckOtherCellsApplyOtherCellResult | TransitionTypes.cs:1614+ | siteId=other_cell, off-cell transition outcome, multi-cell BSP iteration result. Paired with A4's multi-cell BSP. |

All three respect existing logging conventions (timestamp prefix, off when disabled, zero-cost when off — checked via if (!ProbePushBackEnabled) return; early-out at each site).

3.3 Capture pairing protocol per scenario

For each of the 9 scenarios:

  1. Setup phase (~30 sec):

    • User opens retail client, navigates to the scenario start point (e.g., outside Holtburg inn, facing the doorway). Stops.
    • cdb attaches via tools\cdb\a6-probe.cdb with scenario-tagged log (scen1_inn_doorway.log).
    • Separately, user launches acdream with ACDREAM_PROBE_PUSH_BACK=1 + ACDREAM_PROBE_INDOOR_BSP=1 + ACDREAM_PROBE_CELL=1 + ACDREAM_PROBE_CELL_CACHE=1 + ACDREAM_PROBE_CONTACT_PLANE=1.
    • Navigates +Acdream to the same scenario start. Stops.
  2. Capture phase (~30 sec per pass):

    • User performs the SAME scripted walk in BOTH clients (per-scenario scripts in the A6.P1 plan). E.g. "walk forward 2 meters, sidestep right 1 meter, walk forward 2 meters."
    • cdb fires breakpoints, logs to scenario file. acdream emits [push-back] lines to launch.log.
  3. Teardown phase:

    • cdb auto-detaches at hit threshold OR user triggers scenario-end marker.
    • Both logs filed to docs/research/2026-05-21-a6-captures/scenN-<scenario>/{retail,acdream}.log.

Total time estimate: ~5 min/scenario × 9 = 45 min user time at retail. Plus ~30 min capture-setup overhead = ~75 min single session. Captures can be split across days; each scenario's pair is self-contained.

3.4 The 9 scenarios

| # | Scenario | Location | Walk script |
| --- | --- | --- | --- |
| 1 | Inn doorway entry | Holtburg town inn front door | Walk forward through door, stop just inside |
| 2 | Inn stairs ascent | Holtburg inn interior, stairs to 2nd floor | Walk up 4 steps, stop on landing |
| 3 | Inn 2nd-floor walking | Holtburg inn 2nd floor | Walk forward 3 m, sidestep 1 m, walk back |
| 4 | Cottage cellar entry | Holtburg cottage with cellar | Walk to cellar opening, descend 2 steps |
| 5 | Sewer entry portal | Holtburg sewer entrance (in-town building stab) | Walk into portal, then walk 2 m forward inside |
| 6 | Sewer first stair descent | First stair after entry portal | Walk down full stair flight |
| 7 | Inter-room portal transition | Between any two sewer rooms via portal | Walk through portal, stop 1 m past |
| 8 | Open central chamber (multi-Z) | Sewer's multi-Z room | Walk in, traverse center, walk out other side |
| 9 | Dark corridor | Sewer narrow corridor | Walk full length end-to-end |

Order matters: 1→4 are buildings (smaller cells, simpler geometry), 5→9 are dungeons (larger cells, more portals, multi-Z). Capturing buildings first lets us verify the probe is producing usable data before committing to the dungeon traversal.


4. A6.P2 — analysis pipeline

Output: docs/research/2026-05-21-a6-cdb-capture-findings.md. Single document, four mandatory tables plus a per-scenario narrative.

4.1 Table 1 — Per-site push-back delta (the smoking gun)

| Site | Scenario | Retail mean delta (mm) | Retail p99 (mm) | acdream mean (mm) | acdream p99 (mm) | Ratio |
| --- | --- | --- | --- | --- | --- | --- |

Rows = (site × scenario) cross-product. Delta computed as ‖output_center − input_center‖ per call. If our ratio is > 3× retail anywhere, that's the bug candidate. Surfaces over-correction in a single column.
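
One way a Table 1 row could be computed from already-parsed capture records is sketched below. The record shape (pairs of input and output centers), the function names, the assumption that positions are in metres, and the nearest-rank p99 definition are all illustrative choices, not something the capture format is known to dictate.

```python
import math
from statistics import mean


def push_back_mm(center_in, center_out):
    """Push-back distance in millimetres; centers are (x, y, z), assumed metres."""
    return math.dist(center_in, center_out) * 1000.0


def p99(values):
    """Nearest-rank 99th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def table1_row(retail_calls, acdream_calls, threshold=3.0):
    """Each argument is a non-empty list of (input_center, output_center) pairs."""
    retail = [push_back_mm(a, b) for a, b in retail_calls]
    ours = [push_back_mm(a, b) for a, b in acdream_calls]
    retail_mean, ours_mean = mean(retail), mean(ours)
    # Ratio is undefined when retail never pushes back at this site.
    ratio = ours_mean / retail_mean if retail_mean > 0 else None
    return {
        "retail_mean_mm": retail_mean,
        "retail_p99_mm": p99(retail),
        "acdream_mean_mm": ours_mean,
        "acdream_p99_mm": p99(ours),
        "ratio": ratio,
        "flagged": ratio is not None and ratio > threshold,
    }
```

With one retail call that moves the center 0.4 mm and one acdream call that moves it 18 mm, the row's ratio comes out at about 45 and the row is flagged, matching the "45× over" example wording used for the findings doc.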

4.2 Table 2 — Path-frequency diff

| Scenario | Path | Retail count | acdream count | Diff % |
| --- | --- | --- | --- | --- |

Paths labeled 1-7 per the find_collisions dispatcher (PLACEMENT_INSERT, check_walkable, step_down, collide_with_pt, set_collide+slid, step_sphere_up, find_walkable). Surfaces divergent path selection (e.g. "we fire Path 5 step_up where retail fires Path 6 set_collide").
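
The path-frequency diff could be tallied along these lines. This is a sketch under assumptions: each capture is taken to be already reduced to a list of path numbers (1-7), and "Diff %" is assumed to mean the acdream count relative to the retail count, which the table header does not spell out.

```python
from collections import Counter


def path_frequency_diff(retail_paths, acdream_paths):
    """Return (path, retail_count, acdream_count, diff_pct) for paths 1..7.

    diff_pct is None when retail never took the path, since a relative
    difference is undefined in that case.
    """
    retail = Counter(retail_paths)
    ours = Counter(acdream_paths)
    rows = []
    for path in range(1, 8):
        r, a = retail[path], ours[path]
        diff_pct = (a - r) / r * 100.0 if r else None
        rows.append((path, r, a, diff_pct))
    return rows
```

A large positive diff on one path paired with a large negative diff on another is the signature of divergent path selection that the table is meant to surface.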

4.3 Table 3 — ContactPlane lifecycle diff

| Scenario | Retail CP writes/sec | acdream CP writes/sec | Retail CP-restore-from-LKCP/sec | acdream CP-restore/sec |
| --- | --- | --- | --- | --- |

Surfaces the per-frame CP-resynthesis pattern that TryFindIndoorWalkablePlane is masking.

4.4 Table 4 — Sub-step state mutations

| Scenario | Field | Retail mutations/sec | acdream mutations/sec |
| --- | --- | --- | --- |

Fields = cell_array_valid, hits_interior_cell, walk_interp, walkable, collide. Surfaces stale-state across sub-steps (vibration / #88 family).
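
A mutation rate for Table 4 could be derived by counting value changes between consecutive sub-step samples. The sketch below assumes each sample has already been parsed into a dict keyed by field name and that the capture duration is known; both the record shape and the function name are hypothetical.

```python
def mutations_per_sec(samples, field, duration_sec):
    """Count changes of `field` between consecutive samples, per second.

    samples: time-ordered list of dicts, one per sub-step.
    """
    changes = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev[field] != cur[field]
    )
    return changes / duration_sec
```

Counting transitions rather than raw writes keeps a field that is rewritten with the same value every sub-step from looking unstable.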

4.5 Per-scenario narrative

For each scenario, one paragraph describing what the trace shows frame-by-frame. Include a side-by-side trace excerpt at the sub-step where the divergence is sharpest.

4.6 Findings section

Numbered bug candidates. Each entry contains:

  • Title
  • Retail decomp anchor (line in acclient_2013_pseudo_c.txt)
  • Our suspect code site (file + line)
  • Divergence quantified (e.g. "push-back at site X: retail mean=0.4 mm, ours mean=18 mm; 45× over")
  • Proposed fix sketch (1-3 paragraphs)
  • Scenarios affected (which of the 9 reproduce this bug)

4.7 Acceptance for A6.P2

Every M1.5-in-scope symptom (#83, #88, #90, stairs walk-through, 2nd-floor walking, cellar descent, TryFindIndoorWalkablePlane MISS) maps to at least one bug candidate, OR is explicitly flagged as "not surfaced by this capture — defer to A8 / promote scope".


5. A6.P3 — fix surface

5.1 Sequencing

One PR per bug (multi-PR overall). Each PR ships independently with its own commit message + visual verification gate. Reasoning: per-bug attribution is clearer; bisecting future regressions is easier; one PR rollback doesn't take all fixes down.

Order: highest-confidence single-cause fix first (probably AdjustSphereToPlane if Table 1 confirms over-correction). Re-run the 9-scenario probe spike AFTER each PR to verify (a) the targeted bug is closed, (b) no other symptom is worse. If the re-run shows multiple symptoms closed by the same fix, that's evidence for the single-cause hypothesis — file the other planned PRs as N/A and proceed to A6.P4.

5.2 Per-PR shape

Each PR ships:

  • The fix to the suspect code site (surgical change, narrow scope).
  • A unit test using golden values captured during A6.P1 as the regression anchor. Example: "AdjustSphereToPlane at plane N=(0,0,1) d=100, sphere center=(50,50,99.5), movement=(0,0,0.5), radius=0.6 → output center.Z = 99.4 ± 0.1 mm matching retail capture line 47".
  • Visual verification at the scenario that surfaced the bug.

5.3 Likely-touched files

Based on hypothesis + decomp reading. The actual fix surface is whatever A6.P2 surfaces.

| File | Likely touch |
| --- | --- |
| src/AcDream.Core/Physics/BSPQuery.cs | AdjustSphereToPlane, 6-path dispatcher entry, Path 5 (step_up) / Path 6 (set_collide) branches |
| src/AcDream.Core/Physics/TransitionTypes.cs | Transition.FindEnvCollisions indoor branch, ApplyOtherCellResult state reset |
| src/AcDream.Core/Physics/CellTransit.cs | FindCellList if path-selection diverges from retail's point-in-cell |
| src/AcDream.Core/Physics/PhysicsEngine.cs | ResolveCellId only via A6.P4 removal of the #90 workaround; no functional change in A6.P3 |

5.4 Commit message convention

fix(physics): A6.P3 — <bug candidate N> — <one-line summary>. Body references the A6.P2 findings doc by anchor (§ findings.bug-1 etc.).


6. A6.P4 — workaround removal + acceptance

6.1 Workaround 1 — Issue #90 sphere-overlap stickiness

Location: src/AcDream.Core/Physics/PhysicsEngine.cs:285-300.

Removal: pure revert of 4ca3596. The BSPQuery.SphereIntersectsCellBsp helper itself stays — it's also used by #89's CheckBuildingTransit which IS retail-faithful per the sphere_intersects_cell decomp at acclient_2013_pseudo_c.txt:317666.

Verification: with A6.P3's fix(es) landed, walk Holtburg inn doorway — observe NO [cell-transit] ping-pong, no walk-through.

Tripwire: add a regression test at tests/AcDream.Core.Tests/Physics/PhysicsEngineResolveCellIdTests.cs constructing a sphere at the inn-doorway geometry post-push-back, calls ResolveCellId, asserts the indoor CellId is preserved. Golden values captured from A6.P1 scenario #1.

6.2 Workaround 2 — TryFindIndoorWalkablePlane synthesis

Location: src/AcDream.Core/Physics/TransitionTypes.cs:1294-1373 (method body) + caller at :1519.

Removal: delete method body + caller block. The three CP-retention mechanisms — A (Path 6 land write at acclient_2013_pseudo_c.txt:323924), B (validate_transition LKCP proximity restore at :272565), C (post-OK step-down probe at :273242) — must catch the player without the synthesis.

Verification: walk Holtburg cottage doorway threshold (the case that broke Bug A's revert on 2026-05-20). Walk Holtburg sewer first stair descent. With A6.P3's fix(es) landed, observe NO free-fall through doorway threshold, NO falling-stuck on stair descent.

Tripwire: existing walk-miss probe stays enabled. MISS rate must drop to <5% (current: 99.87% per docs/research/2026-05-21-walk-miss-capture-findings.md).

6.3 M1.5-physics acceptance criteria

All five must hold for A6 to be marked complete:

  1. All 9 scenarios walk cleanly with NO probe warnings.
  2. walk-miss MISS rate < 5% across a 60-sec wander.
  3. [cell-transit] log shows zero ping-pong events in the 60-sec wander.
  4. Holtburg Sewer end-to-end walk (entry → 5-7 rooms → exit) without getting stuck, without walking through any wall, without falling through any stair. This is the M1.5 physics acceptance criterion.
  5. M0 + M1 outdoor regression: walk Holtburg outdoor for 60 sec, observe no regressions in outdoor cell handling, no FPS drop, baseline test suite still 1147 + 8 (or whatever post-Phase O baseline is — to be re-measured at A6.P1 start).
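
Criteria 2 and 3 lend themselves to an automated check over the 60-sec wander logs. The sketch below assumes the [cell-transit] log has been parsed into (tick, from_cell, to_cell) tuples and that a ping-pong means an A→B transit immediately followed by B→A within a small tick window; that definition, the window size, and the function names are assumptions for illustration only.

```python
def count_ping_pongs(transits, max_gap_ticks=10):
    """Count A->B transits immediately reversed by B->A within max_gap_ticks.

    transits: time-ordered list of (tick, from_cell, to_cell).
    """
    count = 0
    for (t0, a0, b0), (t1, a1, b1) in zip(transits, transits[1:]):
        if a1 == b0 and b1 == a0 and (t1 - t0) <= max_gap_ticks:
            count += 1
    return count


def miss_rate(hits, misses):
    """Fraction of walk-miss probe calls that were a MISS."""
    total = hits + misses
    return misses / total if total else 0.0


def wander_passes(transits, hits, misses, max_miss_rate=0.05):
    """True when the wander shows zero ping-pongs and a MISS rate below the cap."""
    return count_ping_pongs(transits) == 0 and miss_rate(hits, misses) < max_miss_rate
```

The remaining criteria (clean scenario walks, the sewer end-to-end walk, outdoor regression) stay manual visual verifications.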

6.4 Three-failed-verifications policy

Per CLAUDE.md Visual verification rule: if three consecutive visual verifications fail at the same scenario after attempted fixes, stop and write a handoff doc. Do not attempt a fourth fix on the same symptom in the same session. Hand off with full reproduction notes + probe captures + the failed-fix code diffs.


7. Out of scope

  • A7 (Indoor lighting fidelity) — separate phase, separate methodology (RenderDoc + retail-decomp), follows A6. Per CLAUDE.md, do NOT mix lighting work into A6. If A6.P1 reveals a lighting cause for an apparent physics symptom, file as an A7 issue and continue A6.
  • A2 — polygon_hits_sphere_slow_but_sure tangent-epsilon rejection. Separate issue; A6.P4's walk-miss MISS rate target tolerates the residual A2-class events as long as they're < 5% of all calls.
  • Outdoor physics regression — A6 is indoor-focused. Outdoor walks appear in acceptance criteria only as a regression backstop, not as a fix surface. Any outdoor-physics findings in A6.P1 capture get filed as separate issues for post-A6 work.
  • Combat / animation / movement networking — completely orthogonal. M2's prerequisites.

8. References

8.1 Retail decomp anchors

All in docs/research/named-retail/acclient_2013_pseudo_c.txt:

  • :272717-272798 CTransition::check_other_cells (A4 oracle, already ported)
  • :273099-273133 CTransition::step_up
  • :273193-273239 CTransition::transitional_insert Collide branch
  • :308742-308819 CObjCell::find_cell_list Position-variant (the hysteresis question for #90's root cause)
  • :317657-317671 CCellStruct::point_in_cell + sphere_intersects_cell
  • :321594-321607 SPHEREPATH::set_collide
  • :322032-322077 CPolygon::adjust_sphere_to_plane (suspected over-correction site)
  • :322974-322993 CPolygon::pos_hits_sphere (front-face culling)
  • :323725-323939 BSPTREE::find_collisions (full 6-path dispatcher)
  • :326211-326242 BSPNODE::find_walkable

8.2 Our suspect code sites

  • src/AcDream.Core/Physics/BSPQuery.cs:332 AdjustSphereToPlane
  • src/AcDream.Core/Physics/BSPQuery.cs:1550, :1895 FindCollisions (two overloads)
  • src/AcDream.Core/Physics/PhysicsEngine.cs:285-300 ResolveCellId #90 workaround block
  • src/AcDream.Core/Physics/TransitionTypes.cs:1294-1373 TryFindIndoorWalkablePlane synthesis (workaround)
  • src/AcDream.Core/Physics/PhysicsDiagnostics.cs — probe infrastructure (where [push-back] lives)

8.3 Prior handoffs